Methods for analyzing big data. Encyclopedia of Marketing

The steadily accelerating growth of data volumes is an integral part of modern reality. Social networks, mobile devices, data from measuring devices, business information: these are just a few of the sources that can generate gigantic amounts of data.

Nowadays, the term Big Data has become widespread. Not everyone is aware of how quickly and deeply technologies for processing large amounts of data are changing various aspects of society. Changes are under way in many areas, giving rise to new problems and debates, including in information security, where such important aspects as confidentiality, integrity, and availability are at the forefront.

Unfortunately, many companies today adopt Big Data technology without building a reliable infrastructure that could ensure secure storage of the large amounts of data they collect. On the other hand, blockchain technology is developing rapidly and is intended to address this and many other problems.

What is Big Data?

In essence, the meaning of the term lies on the surface: "big data" means the management of very large volumes of data, as well as their analysis. More broadly, it is information that cannot be processed by classical methods because of its sheer volume.

The term Big Data itself appeared relatively recently. According to the Google Trends service, the popularity of the term began to grow rapidly at the end of 2011.

In 2010, the first products and solutions directly related to processing big data began to appear. By 2011, most of the largest IT companies, including IBM, Oracle, Microsoft and Hewlett-Packard, were actively using the term Big Data in their business strategies. Gradually, analysts of the information technology market began to study the concept actively.

Currently, the term has gained significant popularity and is actively used in various fields. However, it cannot be said with certainty that Big Data is a fundamentally new phenomenon: on the contrary, large data sources have existed for many years. In marketing, these include databases of customer purchases, credit histories, lifestyle data, etc. For many years, analysts have used such data to help companies predict the future needs of customers, assess risks, shape consumer preferences, etc.

Today the situation has changed in two respects:

- More sophisticated tools and methods have appeared for analyzing and comparing different sets of data;
- Analysis tools have been supplemented by many new data sources, driven by the widespread transition to digital technologies and by new methods of collecting and measuring data.

Researchers predict that Big Data technologies will be more actively used in manufacturing, health care, trade, government, and in other various areas and industries.

Big Data is both the data itself and the set of technologies and methods for processing it. The defining characteristic of big data is not only its volume but also other properties that describe the labor-intensive processes of processing and analyzing it.

The source data for processing may include, for example:

- logs of Internet users' behavior;
- Internet of Things data;
- social media;
- meteorological data;
- digitized books from the largest libraries;
- GPS signals from vehicles;
- information about transactions of bank clients;
- data about the location of mobile network subscribers;
- information about purchases in large retail chains, etc.

Over time, the number of data sources keeps growing, and against this background new and increasingly sophisticated methods of processing information appear.

Basic principles of Big Data:

- Horizontal scaling: data volumes can be arbitrarily large, so a system for processing big data must be able to expand dynamically as those volumes grow.
- Fault tolerance: even if some elements of the system fail, the system as a whole must remain operational.
- Data locality: in large distributed systems, data is spread over a significant number of machines. However, where possible and in order to save resources, data is often processed on the same server where it is stored.
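The first and third principles can be illustrated with a small sketch. This is not a description of any particular product: the node names, the hash-based placement rule, and the `node_for`, `distribute`, and `local_totals` helpers are all hypothetical. Records are spread across several nodes (horizontal scaling), each node aggregates only the records it stores (data locality), and the small partial results are then combined.

```python
import hashlib
from collections import defaultdict


def node_for(key: str, nodes: list[str]) -> str:
    """Pick a node for a key by hashing it (a hypothetical placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def distribute(records, nodes):
    """Horizontal scaling: spread (key, value) records across the nodes."""
    storage = defaultdict(list)
    for key, value in records:
        storage[node_for(key, nodes)].append((key, value))
    return storage


def local_totals(storage):
    """Data locality: each node sums only the records it stores itself."""
    return {node: sum(value for _, value in rows) for node, rows in storage.items()}


# Made-up data: 100 records with values 1..100.
records = [(f"user-{i}", i) for i in range(1, 101)]
nodes = ["node-a", "node-b", "node-c"]  # hypothetical machine names

storage = distribute(records, nodes)
partials = local_totals(storage)   # computed where the data lives
total = sum(partials.values())     # only small partial results are combined
print(total)  # 5050
```

Note that this simple modulo-based placement moves most keys when a node is added or removed; real distributed systems typically use more elaborate placement schemes and keep replicas of the data to achieve fault tolerance.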

To uphold all three principles and, accordingly, to store and process big data with high efficiency, new breakthrough technologies are needed, such as blockchain.

Why is big data needed?

The scope of Big Data is steadily expanding:

- Big data can be used in medicine. A diagnosis can be made not only by analyzing the patient's medical history, but also by drawing on the experience of other doctors, information about environmental conditions in the patient's area of residence, and many other factors.
- Big Data technologies can be used to organize the movement of unmanned transport.
- By processing large amounts of data, one can recognize faces in photographs and video materials.
- Big Data technologies can be used by retailers: trading companies can actively use data sets from social networks to tune their advertising campaigns, targeting them as precisely as possible at a particular consumer segment.
- This technology is actively used in organizing election campaigns, including for analyzing political preferences in society.
- Big Data technologies are relevant for revenue assurance (RA) solutions, which include tools for detecting inconsistencies and for in-depth data analysis, making it possible to promptly identify probable losses or distortions of information that could reduce financial results.
- Telecommunications providers can aggregate large amounts of data, including geolocation data; in turn, this information may be of commercial interest to advertising agencies, which can use it to show targeted and local advertising, as well as to retailers and banks.
- Big data can play an important role in deciding whether to open a retail outlet at a particular location, based on data about the presence of a strong target flow of people.

Thus, the most obvious practical application of Big Data technology lies in the field of marketing. Thanks to the development of the Internet and the proliferation of various communication devices, behavioral data (such as the number of calls, shopping habits, and purchases) is becoming available in real time.

Big data technologies can also be used effectively in finance, sociological research, and many other areas. Experts say that all these uses of big data are only the visible tip of the iceberg, since these technologies are used on a much larger scale in intelligence and counterintelligence, in military affairs, and in everything that is usually called information warfare.

In general terms, working with Big Data consists of collecting data, structuring the information obtained using reports and dashboards, and then formulating recommendations for action.

Let's take a brief look at the possibilities of using Big Data technologies in marketing. As is well known, for a marketer information is the main tool for forecasting and developing a strategy. Big data analysis has long been used successfully to identify the target audience and the interests, demand, and activity of consumers. In particular, it makes it possible to show advertising (based on the RTB auction model, Real Time Bidding) only to those consumers who are interested in the product or service.

The use of Big Data in marketing allows businesses to:

- get to know their customers better and attract a similar audience on the Internet;
- assess the level of customer satisfaction;
- understand whether the offered service meets expectations and needs;
- find and implement new ways to increase customer trust;
- create products that are in demand.

For example, the Google Trends service can show a marketer a forecast of seasonal demand for a specific product, as well as fluctuations in and the geography of clicks. If you compare this information with the statistics collected by a suitable plugin on your own site, you can plan how to distribute the advertising budget by month, region, and other parameters.
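As a rough illustration of that last step, the sketch below splits a budget across months in proportion to a seasonal interest index. The index values, the `split_budget` helper, and the proportional rule itself are assumptions made for the example; in practice the numbers would come from a trends report and from your own site analytics.

```python
def split_budget(total_budget: float, interest_by_month: dict[str, float]) -> dict[str, float]:
    """Split a budget across months in proportion to a seasonal interest index."""
    total_interest = sum(interest_by_month.values())
    if total_interest <= 0:
        raise ValueError("interest index must contain positive values")
    return {
        month: round(total_budget * interest / total_interest, 2)
        for month, interest in interest_by_month.items()
    }


# Made-up seasonal index for a product (higher means more search interest).
interest = {"Oct": 20, "Nov": 30, "Dec": 50}

plan = split_budget(10_000, interest)
print(plan)  # {'Oct': 2000.0, 'Nov': 3000.0, 'Dec': 5000.0}
```

The same function can be applied to regions or any other dimension by passing a different index.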

Many researchers believe that the success of Trump's election campaign was based on segmentation and the use of Big Data. The team of the future US President was able to segment the audience correctly, understand its wishes, and show exactly the message that voters wanted to see and hear. Thus, according to Irina Belisheva from the Data-Centric Alliance company, Trump's victory was largely due to a non-standard approach to Internet marketing, which was based on Big Data, psychological and behavioral analysis, and personalized advertising.

Trump's political strategists and marketers used a specially developed mathematical model that allowed them to analyze the data of all US voters in depth and systematize it, targeting with great precision not only by geography but also by voters' intentions, interests, psychological type, behavioral characteristics, etc. On that basis, the marketers organized personalized communication with each group of citizens based on their needs, moods, political views, psychological characteristics, and even skin color, using practically a separate message for each individual voter.

As for Hillary Clinton, her campaign used "time-tested" methods based on sociological data and standard marketing, dividing the electorate only into formally homogeneous groups (men, women, African Americans, Latinos, poor, rich, etc.).

As a result, the winner was the one who properly assessed the potential of new technologies and analysis methods. It is noteworthy that Hillary Clinton spent twice as much on her election campaign as her opponent.

Data: Pew Research

The main problems of using Big Data

One of the main factors holding back the adoption of Big Data in various areas is the problem of choosing which data to process: that is, determining which data needs to be extracted, stored, and analyzed, and which can be disregarded.

Another problem with Big Data is of an ethical nature. A natural question arises: can such data collection (especially without the user's knowledge) be considered a violation of privacy?

It's no secret that the information stored in the Google and Yandex search systems allows these IT giants to continuously improve their services, make them more convenient for users, and create new interactive applications. To do this, search engines collect data about users' activity on the Internet, IP addresses, geolocation data, interests and online purchases, personal data, email messages, etc. All this makes it possible to show contextual advertising that matches the user's behavior on the Internet. Users are usually not asked for consent, and they are not given a choice about what information to disclose about themselves. In other words, by default Big Data collects everything, and it is then stored on the servers of these sites.

This raises the next important problem: ensuring the security of data storage and use. For example, is a given analytics platform, to which consumers automatically transfer their data, secure? In addition, many business representatives note a shortage of highly qualified analysts and marketers who can work effectively with large amounts of data and use them to solve specific business tasks.

Despite all the difficulties of adopting Big Data, businesses intend to increase their investment in this area. According to Gartner's research, the leaders in Big Data investment include media, retail, telecom, banking, and service companies.

Prospects for interaction between blockchain and Big Data technologies

Integrating blockchain with Big Data has a synergistic effect and opens up a wide range of new opportunities for business, making it possible to:

- gain access to detailed information about consumer preferences, on the basis of which detailed analytical profiles can be built for specific suppliers, goods, and product components;
- integrate detailed data on transactions and consumption statistics for different groups of goods by different categories of users;
- obtain detailed analytical data on supply and consumption chains, and control product losses during transportation (for example, weight loss due to drying and evaporation of certain types of goods);
- counteract counterfeiting of products, make the fight against money laundering and fraud more effective, etc.

Access to detailed data on the use and consumption of goods significantly unlocks the potential of Big Data technology to optimize key business processes, reduce regulatory risks, discover new monetization opportunities, and create products that match current consumer preferences as closely as possible.

As is known, representatives of the largest financial institutions are already showing significant interest in blockchain technology.

The potential for analyzing blockchain data with the help of Big Data technology is great. Distributed ledger technology ensures the integrity of information and reliably stores the entire transaction history. Big Data, in turn, provides new tools for effective analysis, forecasting, and economic modeling and, accordingly, opens up new opportunities for making better-informed management decisions.

The tandem of blockchain and Big Data can be successfully used in healthcare. As is known, incomplete or inaccurate data about a patient's health significantly increases the risk of an incorrect diagnosis and incorrectly prescribed treatment. Critically important data about the health of patients of medical institutions must be protected as strongly as possible, be immutable, be verifiable, and not be open to any manipulation.

Information in the blockchain meets all of these requirements and can serve as high-quality, reliable source data for in-depth analysis using new Big Data technologies. Moreover, with the help of blockchain, medical institutions would be able to exchange reliable data with insurance companies, judicial authorities, employers, scientific institutions, and other organizations that require medical information.

Big Data and information security

In a broad sense, information security is the protection of information and supporting infrastructure from accidental or intentional negative impacts of a natural or artificial nature.

In the field of information security, Big Data faces the following challenges:

- problems of protecting data and ensuring its integrity;
- the risk of third-party interference and leakage of confidential information;
- improper storage of confidential information;
- the risk of losing information, for example, as a result of malicious actions;
- the risk of misuse of personal data by third parties.

One of the main problems of big data that blockchain is intended to solve lies in the field of information security. By ensuring that all its basic principles are adhered to, distributed ledger technology can guarantee the integrity and reliability of data, and thanks to the absence of a single point of failure, blockchain makes the operation of information systems stable. Distributed ledger technology can help solve the problem of trust in data, as well as make universal data exchange possible.

Information is a valuable asset, which means that ensuring the basic aspects of information security comes first. In order to stay ahead of the competition, companies must keep up with the times, which means they cannot ignore the potential opportunities and advantages that blockchain technology and Big Data tools can bring.

We regularly come across fashionable words and terms whose meaning we seem to understand intuitively, yet we have no clear picture of what the thing is and how it works.

One of these concepts is Big Data. In Russian you can find a literal translation of the term, but more often people say and write it as is: Big Data. Everyone has surely heard or at least come across this phrase on the Internet, and it seems simple enough, but what exactly it means is not always clear to office humanities types far removed from the subtleties of the digital world.

An excellent attempt to fill this gap in the minds of the widest range of users is an article by one of our favorite authors, Bernard Marr, titled "What is Big Data? A super simple explanation for everyone". Without abstruse jargon, its sole aim is to explain the key ideas of this phenomenon to everyone, regardless of education or field of activity.

In fact, for the last few years we have already been living in a world thoroughly permeated with Big Data, yet we continue to get confused about what it actually is. This is partly because the concept of Big Data itself is constantly being transformed and reinterpreted, since the world of high technology and the processing of large amounts of information changes very quickly, taking in more and more new options. And the volume of this information is constantly growing.

So, what does Big Data mean in 2017?

It all started with the explosive growth in the amount of data that we have been creating since the beginning of the digital age. This has become possible mainly due to the increasing number and power of computers, the expansion of the Internet, and the development of technologies that can capture information from the real, physical world in which we all live and convert it into digital data.

In 2017, we generate data when we go online, when we use our GPS-equipped smartphones, when we connect with friends on social networks, when we use mobile apps or listen to music, when we shop.

You could say that we leave behind a multitude of digital traces whatever we do, as long as our actions involve any digital transaction. That is, almost always and everywhere.

Moreover, the volume of data generated by the machines themselves is growing at a rapid rate. Data is created and transmitted when our smart devices communicate with one another. Manufacturing plants around the world are equipped with devices that collect and transmit data day and night.

In the near future, our streets will be filled with self-driving cars that plot their own routes using maps built from data generated in real time.

What can we do with Big Data?

The endlessly growing flow of sensor information, photographs, text messages, audio and video data lies at the heart of Big Data, which we can use in ways that were impossible to imagine just a few years ago.

Right now, projects based on Big Data are helping to:

Treat diseases and prevent cancer. Medicine based on Big Data analyzes a huge number of medical records and images, which makes very early diagnosis possible and helps create new treatment methods.

Fight hunger. Agriculture is experiencing a real Big Data revolution, which makes it possible to use resources in a way that maximizes yield with minimal interference in the ecosystem and optimizes the use of machinery and equipment.

Explore distant planets. NASA, for example, analyzes a huge amount of data and uses it to build models of future missions to distant worlds.

Predict emergencies of various kinds and minimize possible damage. Data from numerous sensors can predict where and when the next earthquake will strike and how people are likely to behave in an emergency, which increases the chances of survival.

Prevent crime through the use of technologies that allow resources to be allocated more efficiently and directed where they are most needed.

And closest to most of us: Big Data makes the life of an ordinary person simpler and more convenient, whether shopping online, planning trips, or navigating a metropolis.

Finding the best time to buy air tickets and choosing which movie or series to watch has become much easier thanks to the work of Big Data.

How does this work?

Big Data works on the principle: the more you know about something, the more accurately you can predict what will happen next. Comparing individual data points and the relationships between them (we are talking about an enormous amount of data and an incredibly large number of possible connections between them) makes it possible to uncover previously hidden patterns. This makes it possible to look inside the problem and, ultimately, to understand how we can manage a given process.

Most often, processing large volumes of information involves building models from the collected data and running simulations in which key settings are changed each time, while the system monitors how each "change of settings" affects the possible result.

This process is fully automated, since it involves analyzing millions of simulations and trying all possible options until a pattern (the desired scheme) is found or until an "insight" occurs that helps solve the problem for which it was all started.
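The idea of repeatedly changing a setting and watching the result can be sketched in a few lines. The demand model below is invented purely for illustration (a linear drop in demand as the price rises), and the `simulate_revenue` and `sweep` names are hypothetical; real systems run far richer models over many settings at once.

```python
def simulate_revenue(price: float, base_demand: float = 1000.0, sensitivity: float = 10.0) -> float:
    """One simulation run: a toy model where demand falls linearly with price."""
    demand = max(0.0, base_demand - sensitivity * price)
    return price * demand


def sweep(candidates):
    """Run the simulation for every candidate setting and keep the best one."""
    results = {price: simulate_revenue(price) for price in candidates}
    best = max(results, key=results.get)
    return best, results[best]


# Try every whole price from 1 to 100 and see which gives the highest revenue.
best_price, best_revenue = sweep(range(1, 101))
print(best_price, best_revenue)  # 50 25000.0
```

Here a single setting is swept exhaustively; with many settings, the number of combinations grows quickly, which is one reason such searches are automated.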

Unlike the world of objects and calculations familiar to us, data arrives in unstructured form, which means it is hard to fit into the tables with cells and columns that we humans are used to. A huge amount of data is transmitted as images and video: from satellite photos to selfies that you post on Instagram or Facebook, as well as email and instant messenger records or telephone calls.

To give practical meaning to this unbounded and heterogeneous flow of data, Big Data often uses the most advanced analysis technologies, including artificial intelligence and machine learning (where a computer program teaches other programs).

Computers themselves learn to determine what a given piece of information represents, for example by recognizing images or speech, and they can do this much faster than people.

Big Brother?

In proportion to the unprecedented capabilities that Big Data gives us today, the number of concerns and questions associated with its use is growing.

Privacy of personal data. Big Data collects a huge amount of information about our private lives. There is a lot of information that we would prefer to keep secret.

Security. Even if we decide that there is nothing terrible about handing all our personal data to a machine for some specific purpose that benefits us, can we be sure that our data is kept in a safe place?
Who can guarantee this for us, and how?

Discrimination. When everything is known, is it acceptable to discriminate against people based on what Big Data knows about them? Banks use your credit history, and insurance companies set your car insurance rates based on what they know about you. How far can this go?

It can be assumed that, in order to minimize risks, companies, government agencies, and even private individuals will use what they can learn about us, and for some reason will restrict our access to resources and information.

For all its advantages, we must recognize that all these concerns are also an integral part of Big Data. Until recently, it was scientists who puzzled over the answers, but now the time has come when the wave has reached business, which wants to use the advantages of Big Data for its own purposes. And this can be fraught with catastrophic consequences.

Every action of a user on the Internet has long ceased to be a secret kept under seven seals. You can track literally everything, from online purchases to likes, thanks to the concept of Big Data. As a result, you learn more about your target audience and make personalized offers. More precisely, the machine does everything for you: it analyzes the data and makes the optimal decision.

Sounds like science fiction, you say? Of course, the mechanism is not yet that widespread, especially in Russia, and is not fully refined, but the first steps along this road have definitely been taken.

When it comes to big data, what matters is not how much of it you have collected but how you use it. In general, Big Data is a universal technique. In this article we look at its application in marketing and sales.

What is Big Data?

Large transport companies, online stores, telecom providers, SaaS services, banks: in a word, companies with a large customer base collect a huge amount of information.

This includes not only personal data (name, email, phone number, location, age, geography) but also IP addresses, site visit times, number of visits, on-site searches, purchase history, etc. Each company has its own specifics and its own unique data, available only to it.

For example, a taxi service "knows" every step and every second the user spends on a trip. An online banking service knows what you paid for, when, and how much. An online store knows which products you looked at, put in the cart, or added to favorites.

That is, this is not just the data that a CRM system accumulates in any business. It is everything a company can know about its clients, and it can be measured in terabytes of information. Ordinary databases cannot process such volumes, if only because the data changes regularly and grows vertically (+ new client) and horizontally (+ additional information about the client).

In addition, the data is diverse and unstructured, since it comes from completely different sources, for example:

  • Blogs and social media;
  • Audio and video files;
  • Corporate databases;
  • Sensors, measuring devices and sensor networks.

This is Big Data. Since it is something more abstract than physical documents, people cannot manage it themselves. Machine algorithms come to the rescue.

Data Mining, or how big data is collected and processed

Where does big data come from?

First, from your website and all the points where contact information is captured.

Second, from counters and analytics systems (Yandex.Metrica, Google Analytics).

How is big data processed? Here are the main solutions on the Big Data market:

  • Database management systems (SAP, Oracle, Microsoft, IBM and others), which store and process information, analyze the dynamics of indicators, and present results in statistical reports;
  • Services for managing the purchase of RTB advertising, which predict the actions of target customers and target advertising in online channels (for example, Segmento, RTB-Media);
  • Product recommendation services that show on the website the products most interesting to a specific customer (RetailRocket, 1C-Bitrix BigData);
  • Content personalization services that show users the most suitable versions of site pages (Personyze, Monoloop, Crosss);
  • Email personalization services that send targeted emails (for example, Vero, Personyze).

These systems actively cooperate with one another, improving and updating their functionality.

How does Big Data technology work and what is Data Science?

The practical essence of this approach is to minimize human involvement in the decision-making process. This is the basis of the concept of Data Science (literally, "the science of data").

According to this concept, big data is handled by a statistical model. It finds hidden relationships in the data and predicts as accurately as possible (thanks to objectivity and a broad data sample) the behavior of a particular customer: whether they will buy a product, subscribe to a newsletter, or be interested in an article.

This involves a continuous process of self-learning. That is, the machine itself learns (the principle of Machine Learning) in real time and creates algorithms to optimize business processes.

It independently determines and suggests:

  • What, where, and when to offer the user for the maximum likelihood of conversion;
  • How to increase cross-sales and upsells;
  • Which products are the most popular and why;
  • How to improve your product or service for your target audience.

In retail, machines can make the following decisions:

  • Where to open the next store;
  • Which marketing campaigns to run;
  • How to forecast sales for the upcoming period;
  • How to identify the "core" of the audience;
  • How much to raise or lower prices in the coming month;
  • How to optimize the marketing budget;
  • How to identify customers likely to leave in the coming month.

In marketing, this makes it possible to segment the target audience and to develop creatives and personal offers for each segment. Unfortunately, this process is so far only partially automated.

Here is an example.

The company Target set itself an unconventional task: to target pregnant women before they enter themed search queries, share the news on social media, or otherwise tell others about it on the Internet.

How did they manage it? Knowledge of purchasing habits helped. Specifically, Target found in the course of its research that expectant mothers buy a lot of unscented lotion, cotton wipes, and terry washcloths.

Another example.

The Russian e-book service Bookmate knew little about the real interests of its users. They opened the app, but the books offered left them cold. The situation improved thanks to information from social networks. The number of views of recommendations grew 2.17 times, and conversion to paying users increased 1.4 times.

British Airways has taken personalization to a whole new level. As part of the Know Me program, it recognizes customers' faces using Google Images. Staff recognize passengers in airport terminals and on board the aircraft and greet them personally.

In addition, passengers' personal data about previous flights allows the airline to apologize if a past flight was delayed or luggage was lost.

This and other basic information (for example, food preferences) is available to British Airways flight attendants on special work tablets.

Big Data in e-commerce: the Netology case

Goal: optimize marketing communications for 3 online cosmetics stores with an assortment of over 500 products.

What did the Netology specialists do to achieve this?

They started by collecting all available data about the purchasing behavior of the customer base, about 100 thousand consumers, from the popular e-commerce systems Magento and Shopify:

  • Information about purchases, carts, average order value, order times, etc.;
  • Feedback from email subscribers: data on email opens and link clicks from services such as Mailchimp and Dotmailer, as well as on subsequent activity on the site (views of product cards and categories, purchases after a mailing);
  • Repeat visits by regular customers, based on data about product views before a purchase.

From this data the following metrics were derived:

  • Optimal discount size;
  • Customer lifetime and total customer value (LTV);
  • Probability of repeat purchases.
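To make such metrics concrete, here is a minimal sketch that computes a few of them from a list of orders. The order data is made up, the `customer_metrics` helper is hypothetical, and the definitions used (historical revenue per customer as a stand-in for LTV, and the share of customers with more than one order as a stand-in for the probability of a repeat purchase) are simplifying assumptions, not the method used in the case described.

```python
from collections import defaultdict

# Made-up order log: (customer id, order amount).
orders = [
    ("anna", 40.0), ("anna", 60.0),
    ("boris", 30.0),
    ("clara", 50.0), ("clara", 20.0), ("clara", 40.0),
]


def customer_metrics(orders):
    """Compute simple per-customer-base metrics from (customer, amount) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for customer, amount in orders:
        totals[customer] += amount
        counts[customer] += 1

    n_customers = len(counts)
    n_orders = sum(counts.values())
    revenue = sum(totals.values())
    repeat_buyers = sum(1 for n in counts.values() if n > 1)

    return {
        "average_order_value": revenue / n_orders,
        "orders_per_customer": n_orders / n_customers,
        # Share of customers who ordered more than once (simplified repeat rate).
        "repeat_rate": repeat_buyers / n_customers,
        # Historical revenue per customer, a rough stand-in for LTV.
        "value_per_customer": revenue / n_customers,
    }


metrics = customer_metrics(orders)
print(metrics["average_order_value"])  # 40.0
print(metrics["value_per_customer"])   # 80.0
```

A forward-looking LTV would also need an estimate of how long a customer stays active, which this sketch deliberately leaves out.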

In this way, a complete picture of each client emerges, with a unique set of preferences, habits, and traits.

For instance:

Client A. Buys the same hair shampoo every month. There is no reason to run additional promotions on this product for this customer. It is better to offer them, a month later, an additional conditioner and a mask from the same brand.

Client B. Bought eau de toilette and perfume once and then bought nothing else. However, this client reads the online store's emails and is interested in decorative cosmetics. It is likely that client B shops elsewhere. An offer of an eyeshadow set at a discount may be the decisive incentive to make a purchase.

Based on this information, the system formed segments for launching campaigns via email and Facebook: over the period in question there were 40 to 100 automated campaigns for each brand.

During data collection, the researchers identified a number of triggers. For example, a group of users checks email in the morning, and in the evening they go home and buy the product they saw. It makes sense to repeat the product offer in the evening through an additional channel.

Result: repeat sales tripled, the email open rate increased by an average of 70%, and conversion from email increased by 83%.

“Humanizing” data: the Yandex.Taxi case

Yandex.Taxi holds unique data about all trips. Based on this data, marketing communications can be made more emotional. The main idea is to talk to clients in a friendly way and unobtrusively remind them of the service. Personal statistics presented as stories and characters helped to realize this.

Media façades

Yandex.Taxi marketers identified the most popular destinations and routes in each city. To do this, they counted the number of orders to the most significant places: parks, theaters, museums, monuments. These data are not very personal and do not identify anyone, but they show how the city lives.

These observations made it possible to implement the idea of personal communication with the audience through media façades. The design was styled as a friendly message in a chat. Each city has its own phrases.

It is as if the company exchanges a phrase with a person that only the two of them understand. The person is pleased by this attention, and Yandex.Taxi counts on increased brand recognition in the city.

When composing the texts, the following techniques were used:

  • Local slang: local words that all residents understand. These were looked up in city public pages and forums and checked with regional managers and local experts. For example, in Kazan the registry office is called “the Bowl”, and the embankment in Yekaterinburg is called “Drama”;
  • Wordplay. Here are some examples:

3090 people went to Madrid by taxi. You clearly know a thing or two about travel! (“Madrid” is a hotel in Yekaterinburg.)

958 people raced off to Jupiter. You are simply cosmic! (“Jupiter” is the name of a company.)

This was a test experiment; Yandex is now developing a larger integrated campaign that draws on various online and offline resources.

New Year videos

To sum up 2017, Yandex.Taxi wanted to tell clients how much time they had spent together and how many trips that added up to.

To do this well, we came up with plausible stories for individual trips out of the millions taken and made videos on these stories with numbers from the statistics.

It went like this:

764 million minutes of waiting: a couple says goodbye beside the taxi.

56 million morning trips over the year: a mother and daughter ride to a children's matinee.

122 thousand trips with animals.

The results of the first tests showed that videos in this form looked as if the brand was simply boasting about big numbers. To convey the message “look how much time we spent together” more accurately, the statistics were reworked to shift the focus to the characters in the story.

The numbers by themselves say nothing. It is hard to tell whether a number is large or small and what it was meant to show. Yandex uses data not as an end in itself, but as a way to tell a story.

Easter eggs in the app

The company also came up with characters for its clients, “taxitypes”, which depend on the number of trips, their duration, and the waiting time. The identification mechanism takes these three characteristics, builds an image of the client from them, and assigns the client to one of the categories.

The data was evaluated for the city where the person made over 70% of their trips.

The algorithm finds the median and then evaluates each metric against it: “many” or “few” trips, “short” or “long” trips and waits.
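The median-based labelling described above can be sketched as follows. Yandex has not published the actual algorithm, so the sample users, the label strings, and the choice to compute the medians over eligible users only are assumptions. The one threshold taken from the case is the requirement of more than 4 trips per year.

```python
from statistics import median

# Hypothetical yearly stats per user:
# (number of trips, average trip duration in minutes, average wait in minutes).
users = {
    "u1": (120, 12, 6),
    "u2": (90, 35, 1),
    "u3": (10, 15, 2),
    "u4": (6, 40, 5),
    "u5": (3, 20, 3),  # too few trips, gets no taxitype
}

MIN_TRIPS = 4  # from the case: more than 4 trips per year are required


def classify(users):
    """Label each eligible user as above or below the median on three metrics."""
    eligible = {name: s for name, s in users.items() if s[0] > MIN_TRIPS}
    # One median per metric, computed over eligible users (an assumption).
    med = [median(s[i] for s in eligible.values()) for i in range(3)]

    labels = {}
    for name, (trips, duration, wait) in eligible.items():
        labels[name] = (
            "many trips" if trips > med[0] else "few trips",
            "long trips" if duration > med[1] else "short trips",
            "long waits" if wait > med[2] else "short waits",
        )
    return labels


labels = classify(users)
```

Three binary labels give eight possible combinations, which is enough to map each combination to a named character.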

Every user who has made more than 4 trips in a year can find out their “taxitype” in the app, behind a button.

For example:

Black Puma: travels a lot, takes short trips, rarely comes out to the car on time.

Far-sighted Traveler: travels a lot, takes long trips, and gets into the car immediately.

20% of those who viewed their result took a screenshot and shared it on social networks, twice as many as forecast!

Statistics for drivers


The future of Big Data

Experiments with big data will continue.

Yandex is one of the companies that not only teach the concept of Data Science, but also actively apply it in developing their own products.

Take the Yandex.Zen blogging platform. It is available in various countries. There is no need to sort material by topic or other parameters and configure its display for specific categories of users. Everyone reads the articles that interest them and receives a new selection of similar ones. The system simply offers what suits you best.

The point is that machine intelligence does not aim to average. It does not set out to create a limited number of segments; its capabilities allow it to personalize content for each of a billion users.

A foreign analogue is alexa.com: a rating of the most visited sites worldwide and in individual countries (country-level selections are paid).

Automatic data collection (through its own services, such as Yandex.Browser) and statistical models make it possible to include sites in the list that do not take part in other ratings.

Even so, the current version makes it possible to identify leaders in various niches and, with the help of other services, model their strategies for promotion and traffic acquisition.

Let's say you select 5-10 clients, and the machine finds thousands of similar ones and targets them. The advantage of machine intelligence is that it takes into account factors that a person may overlook or not know about at all.
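A minimal sketch of this kind of look-alike search, assuming each client is described by a small numeric feature vector. The feature values, the use of cosine similarity to the centroid of the seed clients, and the function names are illustrative assumptions, not a description of Yandex's method.

```python
import math


def centroid(vectors):
    """Component-wise mean of equally long feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_lookalikes(seeds, candidates, top_n):
    """Rank candidates by similarity to the average seed client."""
    center = centroid(seeds)
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(center, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]


# Hypothetical features: (purchases per year, site visits per week, returns).
seeds = [[5, 1, 0], [4, 2, 0]]
candidates = {"x": [9, 3, 0], "y": [0, 1, 5], "z": [4, 1, 1]}

lookalikes = find_lookalikes(seeds, candidates, top_n=2)
```

With only three features a person could do this by eye; the advantage the text describes appears when the vectors have hundreds of dimensions that nobody would think to check manually.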

  • Learn to recognize which decisions are better made by humans and which by machines, and do not confuse the two classes. Algorithms cope more efficiently with comparison tasks (choosing a button design), while only a human can be creative (designing a website from scratch);
  • Train both people and algorithms;
  • Keep in mind that algorithms answer questions wonderfully but cannot pose questions themselves. Although, perhaps, they will learn to in time.

By the way, the theory of the “continuity” of human and machine intelligence is being challenged more and more often. On this subject you can watch the debate between Andriy Sebrant and Anton Bulanov (director of INVITRO, the largest private medical company).

It covers segmentation, marketers, budget management, and why a “Bring me clients” button is needed.

It can be watched in one sitting.

Column by HSE University lecturers on myths and cases of working with big data

Lecturers from the School of New Media at HSE University, Kostyantin Romanov and Oleksandr Pyatigorsky, who is also the director of digital transformation at Beeline, wrote a column for the site about the main misconceptions about big data, with examples of how the technology and tools are used. The authors hope that the publication will help company managers understand the concept.

Myths and misconceptions about Big Data

Big Data is not marketing

The term Big Data has become very fashionable: it is used in millions of situations and in hundreds of different interpretations, which often have nothing to do with what it actually is. Concepts often get substituted in people's minds, and Big Data is confused with a marketing product. Moreover, in some companies Big Data is part of the marketing department. The results of big data analysis can indeed serve as a source for marketing activity, but nothing more. Let's look at how it works.

If we compile a list of those who bought goods worth over three thousand rubles from our store two months ago and then send these customers an offer, that is typical marketing. We derive a clear pattern from structured data and use it to increase sales.

However, if we combine CRM data with streaming information, for example from Instagram, and analyze it, we may find a pattern: a person who has reduced their activity on Wednesday evenings and whose latest photo shows kittens should be made a certain offer. That is Big Data. We found the trigger, handed it over to marketers, and they used it for their own purposes.

It follows that the technology usually works with unstructured data, and even when the data is structured, the system still goes on searching for hidden patterns in it, which marketing does not do.

Big Data is not IT

The other extreme of this story: Big Data is often confused with IT. This happens because in Russian companies IT specialists are, as a rule, the drivers of all technologies, including big data. So when everything happens in that department, the company as a whole gets the impression that this is some kind of IT activity.

In fact, there is a fundamental difference: Big Data is an activity aimed at obtaining a specific product, which has nothing to do with IT as such, although the technology cannot exist without it.

Big Data is not always the collection and analysis of information

Another misconception about Big Data. Everyone understands that this technology involves large volumes of data, but it is not always clear what kind of data is meant. Information can now be collected and used not only in films but in any company, even a very small one. The question is what exactly to collect and how to use it to your own benefit.

It should be understood that Big Data technology is not the collection and analysis of absolutely any information. For example, if you collect data about one specific person on social networks, that is not Big Data.

What Big Data really is

Big Data consists of three elements:

  • data;
  • analytics;
  • technology.

Big Data is not any one of these components, but the combination of all three. People often substitute concepts: some believe that Big Data is only data, others that it is only technology. But in fact, no matter how much data you collect, you will not be able to do anything with it without the necessary technologies and analytics. And if the analytics are good but there is no data, it is even worse.

As for data, it is not just texts but also all the photographs posted on Instagram, and in general everything that can be analyzed and used for various purposes and tasks. In other words, data means large volumes of internal and external data of various structures.

Analytics is also needed, because the task of Big Data is to identify patterns. That is, analytics is the identification of hidden dependencies and the search for new questions and answers based on analysis of the whole volume of heterogeneous data. Moreover, Big Data raises questions that cannot be directly derived from this data.

As for images, the fact that you posted a photo of yourself in a black T-shirt says nothing by itself. But if the photograph is used for Big Data modeling, it may turn out that you should be offered a loan right now, since in your social group such behavior points to a particular pattern. Therefore “bare” data without analytics, without the identification of hidden and non-obvious dependencies, is not Big Data.

So, we have big data. The array is huge. We also have analytics. But how do we get from this raw data to a concrete solution? For that we need technologies that allow us not only to store the data (which was previously impossible), but also to analyze it.

Simply put, if you have a lot of data, you will need technologies such as Hadoop that allow you to store all the information in its original form for later analysis. Such technologies originated at the Internet giants, since they were the first to face the problem of storing large volumes of data and analyzing them for later monetization.

Besides tools for optimized and cheap data storage, you need analytical tools as well as add-ons for the chosen platform. For example, a whole ecosystem of related projects and technologies has already developed around Hadoop. Here are some of them:

  • Pig - a high-level data-flow language for data analysis.
  • Hive - data analysis using a language close to SQL.
  • Oozie - a workflow scheduler for Hadoop.
  • HBase - a non-relational database, an analogue of Google Bigtable.
  • Mahout - machine learning.
  • Sqoop - transferring data from an RDBMS to Hadoop and back.
  • Flume - transferring logs to HDFS.
  • Zookeeper, MRUnit, Avro, Giraph, Ambari, Cassandra, HCatalog, Fuse-DFS and so on.

All these tools are available to everyone free of charge, but there is also a set of paid add-ons.

In addition, specialists are required: a developer and an analyst (the so-called Data Scientist). A manager is also needed who understands how to apply this analytics to a specific task, because by itself it is absolutely useless if it is not built into business processes.

All three specialists must work as a team. A manager who asks a Data Scientist to find a certain pattern must realize that the result will not always be exactly what he needs. In that case the manager must listen carefully to what the Data Scientist has found, since such findings often turn out to be more interesting and useful for the business. Your job is to apply this to the business and make a product out of it.

Even though there are now many kinds of machines and technologies, the final decision always remains with a human. For this, the information needs to be visualized in some way. There are plenty of tools for this.

The most telling example is geoanalytical reports. Beeline works extensively with the administrations of various cities and regions. Very often these organizations order reports such as “Transport load in a certain place.”

Clearly, such a report must reach government bodies in a simple and understandable form. If we give them a huge and completely incomprehensible table (that is, the information in the form in which we obtain it), they are unlikely to buy such a report: it will be absolutely useless, and they will not get from it the knowledge they wanted.

Therefore, no matter how good the data scientists are and whatever patterns they find, you will not be able to work with this data without good visualization tools.

Data sources

The array of data obtained is very large, so it can be divided into several groups.

Internal company data

Although 80% of the data collected belongs to this group, this source is not always used. Often it is data that, it would seem, nobody needs at all, for example logs. But if you look at them from a different angle, you can sometimes find unexpected patterns in them.

Conditionally free sources

This includes data from social networks, the Internet, and everything that can be accessed free of charge. Why conditionally free? On the one hand, this data is accessible to everyone, but if you are a large company, then obtaining it at the scale of a subscriber base of tens of thousands, hundreds of thousands, or millions of clients is no longer an easy task. Therefore, there are paid services on the market that provide this data.

Paid sources

This group includes companies that sell data for money. These can be telecoms, DMPs, Internet companies, credit history bureaus, and aggregators. In Russia, telecoms do not sell data. First, it is economically unviable, and second, it is prohibited by law. That is why they sell the results of their processing, for example geoanalytical reports.

Open data

The state meets business halfway and gives it the opportunity to use the data it collects. In this regard, Russia is also moving with the times. For example, there is the Moscow Government's Open Data Portal, where information about various objects of city infrastructure is published.

For residents and guests of Moscow, the data is presented in tabular and cartographic form, and for developers in special machine-readable formats. For now, the project operates in a limited mode, but it is developing, which means it is also a source of data that you can use for your business tasks.

Research

As already noted, the task of Big Data is to find a pattern. Often, research carried out around the world can become a fulcrum for finding one pattern or another: you can take a specific result and try to apply similar logic to your own goals.

Big Data is a field where not all the laws of mathematics apply. For example, “1” + “1” is not “2” but much more, because by combining data sources you can significantly strengthen the effect.

Product examples

Many people are familiar with the Spotify music selection service. It is great because it does not ask users what mood they are in today, but calculates it from the sources available to it. It always knows what you need right now: jazz or hard rock. This is the key difference that wins it fans and sets it apart from other services.

Such products are usually called sense products: those that sense their client.

Big Data technology is also used in cars. Take Tesla, for example: its latest model has an autopilot. The company aims to build a car that will itself take the passenger wherever they need to go. Without Big Data this is impossible, because if we use only the data we receive directly, as a human does, the car will not be able to improve.

When we drive a car ourselves, our neurons make decisions based on many factors that we do not even notice. For example, we may not realize why we decided not to accelerate immediately at a green light, and then it turns out that the decision was correct: a car sped past at crazy speed, and you avoided an accident.

Big Data is also used in sports. In 2002, the general manager of the Oakland Athletics baseball team, Billy Beane, decided to break the paradigm of how athletes should be scouted: he selected and trained players “by the numbers.”

Managers usually look at players' past successes, but in this case things were different: to get results, the manager studied which combinations of athletes he needed, paying attention to individual characteristics. Moreover, he chose athletes who did not seem to have great potential on their own, yet the team as a whole turned out to be so successful that it won twenty games in a row.

Director Bennett Miller made a film about this story, “Moneyball” (released in Russian as “The Man Who Changed Everything”), starring Brad Pitt.

Big Data technology is also useful in the financial sector. No person in the world can independently and accurately determine whether someone should be given a loan. To make the decision, scoring is carried out: a probabilistic model is built from which one can judge whether this person will repay the money or not. Scoring is then applied at all later stages: one can, for example, calculate that at a certain moment a person will stop paying.
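As a rough illustration of what such a probabilistic model looks like, here is a minimal sketch of credit scoring with logistic regression trained by plain gradient descent. The two features (debt-to-income ratio and number of past late payments), the toy training data, and the hyperparameters are assumptions chosen for the example; real scoring models use far more data and features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights with batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for one applicant.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def repay_probability(model, applicant):
    """Estimated probability that the applicant repays the loan."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, applicant)) + bias)


# Hypothetical history: (debt-to-income ratio, past late payments) -> repaid?
rows = [(0.1, 0), (0.2, 0), (0.3, 1), (0.7, 2), (0.8, 3), (0.9, 4)]
labels = [1, 1, 1, 0, 0, 0]

model = train(rows, labels)
good = repay_probability(model, (0.15, 0))
risky = repay_probability(model, (0.85, 3))
```

The output is a probability rather than a yes/no answer, which is what lets a lender pick its own cut-off and re-score the same client later as new payment data arrives.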

Big data makes it possible not only to earn money, but also to save it. In particular, this technology helped Germany's Ministry of Labor cut spending on unemployment benefits by 10 billion euros, after analysis of the information showed that 20% of the benefits were being paid undeservedly.

The technology is also used in medicine (this is especially well known in Israel). With the help of Big Data, it is possible to carry out a far more accurate analysis than a doctor with thirty years of experience can.

Any doctor, when making a diagnosis, relies only on their own experience. When a machine does this, it draws on the experience of thousands of such doctors and on all existing case histories. It takes into account what material the patient's house is built of, what area the patient lives in, what the air pollution is like there, and so on. That is, it considers a mass of factors that doctors do not take into account.

An example of the use of Big Data in health care is Project Artemis, which was launched by a Toronto hospital. It is an information system that collects and analyzes data on infants in real time. The machine can analyze 1260 health indicators for each child. The project aims to predict unstable conditions in a child and to prevent illness in children.

Big data is starting to be used in Russia as well: for example, Yandex has created a big data division. Together with AstraZeneca and the Russian Society of Clinical Oncology (RUSSCO), the company launched the RAY platform, intended for geneticists and molecular biologists. The project makes it possible to improve methods for diagnosing cancer and identifying predisposition to oncological diseases. The platform went into operation in December 2016.

Another Yandex Data Factory project, Sniper, was developed jointly with the Magnitogorsk Iron and Steel Works and focuses on optimizing steel smelting processes using machine learning algorithms. It is planned that the final software product will recommend the optimal amount of ferroalloys and additive materials during steel production.

Big Data is being used or can be used in absolutely all areas, right up to water utilities buying data from mobile operators. In particular, this is typical for Rome, where the sewage system is very weak; with the help of Big Data, the utilities can predict activity in particular parts of the city, which helps them avoid pipe bursts and other problems.

On the whole, a great many products can be built on Big Data. They can change a field completely, as in health care, or only modify it, as in online shopping. Big Data opens up great opportunities for everyone. You just need to learn how to work with it.

The term “Big Data” may be widely recognized today, but there is still a lot of confusion about what it really means. In truth, the concept is constantly evolving and being revised, as it remains the driving force behind many ongoing waves of digital transformation, including artificial intelligence, data science and the Internet of Things. What is Big Data technology and how is it changing our world? Let's try to explain the essence of Big Data technology and what it means in simple words.

The explosive growth of Big Data

It all started with the “explosion” in the amount of data we have created since the beginning of the digital age. This has a lot to do with the development of computers, the Internet and technologies capable of capturing data from the world around us. Data in itself is not a new invention. Even before the era of computers and databases, we had paper transaction records, client records, and archive files, all of which are data. Computers, especially electronic spreadsheets and databases, allowed us to store and organize data easily on a large scale. Suddenly, information became available with a single click of the mouse.

Nevertheless, we have come a long way since the first spreadsheets and databases. Today, every two days, we create as much data as we did from the very beginning up to the year 2000. That's right, every two days. And the volume of data we create continues to grow rapidly; by 2020, the amount of available digital information will increase from approximately 5 zettabytes to 20 zettabytes.

Nowadays, almost every action we take leaves a trace. We generate data whenever we go online, when we carry our smartphones equipped with a GPS module, when we communicate with our friends through social media or chat, and so on. Moreover, the amount of machine-generated data is also growing rapidly. Data is generated and shared as our smart home devices exchange data with one another or with their home servers. Industrial equipment in plants and factories is increasingly equipped with sensors that accumulate and transmit data.

The term “Big Data” refers to the collection of all this data and our ability to use it to our advantage in a wide range of areas, including business.

How does Big Data technology work?

Big Data works on the principle that the more you know about a given object or phenomenon, the more reliably you can reach new insights and predict what will happen in the future. As more data points are compared, relationships emerge that were previously hidden, and these relationships allow us to learn and make better-informed decisions. Most often, this is done through a process that involves building models based on the data we can collect, and then running simulations in which the values of the data points are adjusted each time and the effect on our results is monitored. This process is automated: modern analytics technology will run millions of these simulations, adjusting all the possible variables until it finds a model, or an idea, that helps solve the problem it is working on.
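The loop of building a model, adjusting the inputs, and watching the result can be sketched in a few lines. The toy revenue model below (demand falling linearly with the effective price) and the candidate values are assumptions made purely for illustration; a real system would fit the model to collected data and search a far larger space.

```python
from itertools import product


def simulate(price, discount):
    """Toy model (an assumption): demand falls linearly as the effective price rises."""
    effective_price = price * (1 - discount)
    demand = max(0.0, 100 - 4 * effective_price)
    return effective_price * demand  # revenue for this scenario


def sweep(prices, discounts):
    """Run the model for every combination of inputs and keep the best one."""
    results = {
        (price, discount): simulate(price, discount)
        for price, discount in product(prices, discounts)
    }
    best = max(results, key=results.get)
    return best, results


best, results = sweep(prices=[10, 15, 20], discounts=[0.0, 0.1, 0.25])
```

Here nine scenarios are evaluated exhaustively; the automation the text describes amounts to doing the same over millions of combinations, usually with a smarter search strategy than a full grid.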

Bill Gates hangs above the stack of paper whose contents fit on a single CD

Until recently, data was limited to electronic spreadsheets and databases, and everything was very ordered and neat. Anything that could not be easily organized into rows and columns was considered too difficult to work with and was ignored. However, progress in data storage and analytics means that we can now capture, store and process a large amount of data of different types. As a result, “data” today can mean anything from databases to photographs, videos, sound recordings, written texts and sensor data.

To make sense of all this messy data, Big Data projects most often rely on cutting-edge analytics involving artificial intelligence and machine learning. By teaching computers to identify what specific data represents, for example through image recognition or natural language processing, we can have them learn to spot patterns much faster and more reliably than we can ourselves.

How is Big Data used?

This constantly growing flow of sensor data, text, voice, photo and video data means that we can now use data in ways that were not possible even a few years ago. This is bringing revolutionary changes to the world of business in almost every industry. Companies can now predict with remarkable accuracy which specific categories of clients will want to make a purchase, and when. Big Data also helps companies run their operations more efficiently.

Beyond the business sphere, projects related to Big Data are already helping to change our world in different ways:

  • Improving health care. Data-driven medicine analyzes vast numbers of medical records and images for patterns that can help detect illness at an early stage and develop new medicines.
  • Predicting and responding to natural and man-made disasters. Sensor data can be analyzed to predict where earthquakes may occur, and human behavior patterns give clues that help organizations provide relief to survivors. Big Data technology is also used to track and protect the flow of refugees from war zones around the world.
  • Preventing crime. Police forces are increasingly adopting data-driven strategies that draw on their own intelligence and publicly accessible information to deploy resources more effectively and to act as a deterrent where it is needed.

Good books about Big Data technology

  • Everybody Lies. Search engines, Big Data and the Internet know everything about you.
  • BIG DATA. All the technology in one book.
  • The Happiness Industry. How Big Data and new technologies help add emotion to products and services.
  • The Analytics Revolution. How to improve your business in the Big Data era with the help of operational analytics.

Problems with Big Data

Big Data gives us unprecedented insights and opportunities, but it also raises problems and questions that need to be addressed:

  • Data privacy. The Big Data we generate today contains a lot of information about our personal lives, much of which we have a right to keep confidential. More and more often, we are asked to strike a balance between the amount of personal data we disclose and the convenience offered by applications and services built on Big Data.
  • Data protection. Even if we decide that we are happy for someone to have our data for a particular purpose, can we trust them to keep that data safe and secure?
  • Data discrimination. Once all the information is known, will it become acceptable to discriminate against people based on data about their personal lives? We already use credit scoring to decide who can borrow money, and insurance is also heavily data-driven. We should expect to be analyzed and assessed in ever greater detail, and care must be taken that this does not make life more difficult for people who have fewer resources and limited access to information.

Dealing with these problems is an important part of Big Data, and they must be addressed by organizations that want to use such data. Failure to do so can leave a business vulnerable, not only in terms of its reputation, but also legally and financially.

A look at the future

Data is changing our world and our lives at an unprecedented pace. If Big Data is capable of all this today, just imagine what it will be capable of tomorrow. The amount of data available to us will only increase, and analytics technology will become even more advanced.

For business, the ability to use Big Data will become increasingly important in the coming years. Only those companies that view data as a strategic asset will survive and prosper. Those who ignore this revolution risk being left behind.


