Big Data: how to deal with the diversity of formats?

Campus Party Brazil 2016 is currently taking place in São Paulo. The technology event brings together communities with interests as diverse as entertainment, tool development, and the use of the internet for social transformation. It is considered the most important technology event in the country, providing an environment for exchanging knowledge and disseminating innovations. Reading about the event, I was reminded of a recent TEDx New York video in which data scientist Ben Wellington talks about the potential of Big Data for social transformation and the need for some standardization of data formats to take full advantage of the available information.

Video 1 – How we found the worst parking spot in New York using Big Data

Source: TEDx New York

 

In Video 1, Ben cites numerous examples of insights he obtained from data made available by the New York City government as part of a transparency and Open Data initiative started by Mayor Bloomberg. However, he criticizes the lack of data standardization and the excessive use of the Portable Document Format (PDF) to publish information that could be released as Excel spreadsheets or Comma-Separated Values (CSV) files, a practice that makes the information hard to extract and analyze.

This is, without a doubt, a great challenge for the use of Big Data in business decision-making. Although data are available, they only become valuable when transformed into relevant information and delivered to decision makers. Dealing with the complexity of information flows therefore involves three major challenges: analyzing and interpreting an ever-increasing amount of information (volume), coming from different sources and in different formats (variety), and making it available practically in real time to a large number of stakeholders (velocity).

Among these challenges, the main difficulty today seems to be working with and cross-referencing data in widely different formats, such as text, databases, spreadsheets, audio, video, financial transactions, and meter and sensor records, among others. Much of this data is not numeric, which calls for new and sophisticated analysis tools that few companies are able to use consistently. This reinforces Ben's appeal to advance in building rules for standardizing information, and the hope that Campus Party is a success and that new search and information analysis robots are created to help us move forward in the use of Big Data.
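As a small illustration of what cross-referencing data across formats involves, the sketch below joins a CSV source with a JSON source using only the Python standard library. Everything in it is hypothetical and chosen for illustration: the sample data, the field names (`store_id`, `units_sold`, `store`, `region`) and the helper functions do not come from the video or from any real dataset. Note that the two sources name the same key differently, which is exactly the kind of friction that standardization would remove.

```python
import csv
import io
import json

# Hypothetical sample data: two sources describing the same stores,
# one published as CSV and the other as JSON.
CSV_TEXT = """store_id,units_sold
A1,120
B2,75
"""

JSON_TEXT = '[{"store": "A1", "region": "Southeast"}, {"store": "B2", "region": "South"}]'


def load_sales(csv_text):
    """Parse CSV text into {store_id: units_sold}, converting counts to int."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["store_id"]: int(row["units_sold"]) for row in reader}


def load_regions(json_text):
    """Parse a JSON array into {store_id: region}."""
    return {item["store"]: item["region"] for item in json.loads(json_text)}


def units_by_region(sales, regions):
    """Join the two sources on the store id and total units sold per region."""
    totals = {}
    for store_id, units in sales.items():
        # Stores missing from the JSON source are grouped under "unknown".
        region = regions.get(store_id, "unknown")
        totals[region] = totals.get(region, 0) + units
    return totals


if __name__ == "__main__":
    print(units_by_region(load_sales(CSV_TEXT), load_regions(JSON_TEXT)))
```

Both inputs here are machine-readable, so each loader is only a few lines long. The same figures locked inside a PDF would first need a separate and more error-prone extraction step before any join could happen.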

 

References

<https://www.ted.com/talks/ben_wellington_how_we_found_the_worst_place_to_park_in_new_york_city_using_big_data?language=pt-br>

<https://pt.wikipedia.org/wiki/Campus_Party_Brasil>

HILBERT, M.; LOPEZ, P. The World's Technological Capacity to Store, Communicate, and Compute Information. Science, vol. 332, no. 6025, p. 60–65, 2011.

MANYIKA, J.; CHUI, M.; BROWN, B.; et al. Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute, p. 1–137, 2011.

 

https://ilos.com.br

Executive Partner of ILOS. Graduated in Production Engineering from EE/UFRJ, Master in Business Administration from COPPEAD/UFRJ with extension at EM Lyon, France, and PhD in Production Engineering from COPPE/UFRJ. He has published several articles in periodicals and specialized magazines and is one of the authors of the book “Sales Forecast: Organizational Processes & Qualitative and Quantitative Methods”. His research areas are Demand Planning, Customer Service in the Logistics Process, and Operations Planning. He worked for 8 years at CEL-COPPEAD / UFRJ, helping to organize the Logistics Teaching area. In consultancy, he carried out several projects in the logistics area, such as Diagnosis and Master Plan, Sales Forecast, Inventory Management, Demand Planning and Training Plan in companies such as Abbott, Braskem, Nitriflex, Petrobras, Promon IP, Vale, Natura, Jequití, among others. As a professor, he taught classes at companies such as Coca-Cola, Souza Cruz, ThyssenKrupp, Votorantim, Carrefour, Petrobras, Vale, Via Varejo, Furukawa, Monsanto, Natura, Ambev, BR Distribuidora, ABM, International Paper, Pepsico, Boehringer, Metrô Rio, Novelis, Sony, GVT, SBF, Silimed, Bettanin, Caramuru, CSN, Libra, Schlumberger, Schneider, FCA, Boticário, Usiminas, Bayer, ESG, Kimberly Clark and Transpetro, among others.

