Knowledge Sharing Session “Explore Data with R” – AdIns, Software for Multifinance
On Thursday, June 28, 2018, the Knowledge Sharing Session (KSS) was once again held successfully in the AdIns Theater room. This time the speaker was Gishella Erdyaning R, with participants from various business units. Gishella is one of Immatic’s CONFINS developers at AdIns. As a multifinance software vendor, AdIns holds a weekly activity called KSS to share new or updated knowledge with its internal staff. Following recent developments in the software field, this KSS had the theme “Explore Data With R” and the title “R as a tool for data exploration and data visualization”. Here is a summary of the material:
Technology is advancing rapidly today, and Data Science is a hot topic both in discussion and in practice. Data Science is a means of processing Big Data. Big Data itself is a general term for data sets so large and complex that they are difficult to handle or process with only ordinary database management or traditional data processing applications.
If we can process data effectively, we can extract the information we need. That’s why we need Data Science. In general, Data Science has several basic elements:
- Data Engineering is the process of cleaning raw data from Big Data so that the Data Analyst can analyze it easily
- Data Analyst is the person who analyzes the cleaned data to produce the needed information
- Data Visualization is the process of presenting the processed data in a visual, attractive form. Generally, the visualization is shown to company executives (decision makers)
- Machine Learning is the process of building a model from the information that has been analyzed. Prediction models are commonly used in machine learning for business and economics. They can also be used to support diagnoses in medicine and health.
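The four elements above can be pictured as one small pipeline. The following is only a minimal sketch, not material from the session: the records, the branch names, and the function names (`clean`, `total_by_branch`, `text_bar_chart`, `fit_line`) are all hypothetical, and Python's standard library is used purely for illustration. The same steps could equally be written in R.

```python
from collections import defaultdict

# Hypothetical raw records (branch name, amount as text); invented for illustration.
RAW = [
    {"branch": " Jakarta ", "amount": "120"},
    {"branch": "jakarta", "amount": "80"},
    {"branch": "Bandung", "amount": "n/a"},  # unusable amount, dropped by clean()
    {"branch": "Bandung", "amount": "60"},
    {"branch": "Surabaya", "amount": "40"},
]


def clean(rows):
    """Data engineering: normalize branch names, drop rows with non-numeric amounts."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        cleaned.append({"branch": row["branch"].strip().title(), "amount": amount})
    return cleaned


def total_by_branch(rows):
    """Data analysis: sum the amounts per branch."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["branch"]] += row["amount"]
    return dict(totals)


def text_bar_chart(totals, unit=20):
    """Data visualization: one text bar per branch, one '#' per `unit` of amount."""
    lines = []
    for name, value in sorted(totals.items(), key=lambda kv: -kv[1]):
        lines.append(f"{name:<10}{'#' * int(value // unit)} {value:.0f}")
    return "\n".join(lines)


def fit_line(xs, ys):
    """Machine learning (simplest case): least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    totals = total_by_branch(clean(RAW))
    print(text_bar_chart(totals))
    # Hypothetical monthly totals used to predict the next month.
    slope, intercept = fit_line([1, 2, 3, 4], [10, 20, 30, 40])
    print("Predicted value for month 5:", slope * 5 + intercept)
```

Each function stands in for one element: cleaning, analysis, visualization, and a very simple prediction model. Real projects would normally rely on dedicated libraries for each step rather than hand-written helpers like these.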
To visualize data, we need a tool. Many tools are available, including HP Big Data, Lumify, MapReduce, HPCC Systems Big Data, Storm and many more. But the two tools most popular for Data Science today are R and Python. Both are favorites for the features they offer, and the choice depends only on the user’s preferences. Those who are accustomed to working with statistics will generally prefer R. This session focused on R as a tool for data exploration and data visualization.
As a programming language, R also has an IDE (Integrated Development Environment) named RStudio. An IDE is a software application that provides comprehensive facilities to programmers for software development. RStudio gives R a friendly interface that is easy for users to work with.
R is very flexible. We can use it to generate attractive data visualizations and also to perform data mining. By one definition, data mining is the process of finding interesting patterns in large amounts of stored data. An advantage of R is that it is Open Source: it can be downloaded for free from the internet, and its source code is published to the public. As a result, people can fix weaknesses in the software and modify its appearance or language. There are also several active communities that develop its libraries. Users can apply these libraries to create attractive visualizations and accurate data models.
R is also a high-level language, so its syntax is easy to understand because it is close to human language. Syntax is the set of rules for writing statements so that they are properly understood by the programming language.
Currently, people outside the IT world have also become interested in learning Data Science. A number of organizations have also held various workshops on the theme of Data Science.