The Cycle of Smart Development (Big Data, Algorithms, and Cloud Computing)
In today's world of smart development, three terms stand out as the most important and attractive: big data, the cloud, and algorithms. Together they form a unique development cycle, and the major successful applications of today's smart world, such as Facebook, LinkedIn, and Amazon, have relied on these technologies. When someone refers to data, they mean big data; when they refer to the cloud, they mean the computing power they need; and when they refer to algorithms, they mean the performance of the system. All three sustain one development cycle: big data demands more computing power, and the value extracted from it depends on the performance of the algorithms.
Big data, algorithms, and cloud computing power have steadily penetrated our daily lives. The automotive, education, medical, financial, cultural, and entertainment industries all depend on this triangle of technologies; without it, no industry can keep pace with growth, and the triangle has become the standard of intelligence in our smart world.
Data / Big Data
Historical data is a critical part of this technology; without it, no result can be achieved. Machine learning, deep learning, and artificial intelligence all depend on data. Big data needs to be cleaned and made noise-free before it can become part of the technology. Data is generated in many fields tied to our daily lives, such as e-commerce, finance, healthcare, and transportation.
After accumulating data from these portals, developers can actually start to model the new value-generating system. The better the data is organized and structured, the better the results in the subsequent steps of building the system.
Consider a few examples. In e-commerce, the relevant data sets help with click-through-rate prediction and clothing matching. In weather applications, rainfall prediction depends on data that reflects the livelihood of the population. Transportation also benefits vastly from big data, given the public interest in flight scheduling and route planning.
The first problem in the development cycle is that the needed data sets are often not available at all. And when a data set is found, it may not be in a structured format and may not match any standard; such data sets take a great deal of time to clean and transform into valid ones.
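As a hedged illustration of the cleaning step described above, the following Python sketch drops incomplete or malformed records and coerces fields into a consistent shape. The record fields ("user_id", "price") and the validity rules are hypothetical, not taken from any specific data set:

```python
# Sketch of a basic cleaning pass: drop incomplete or malformed records
# and coerce fields into a consistent, structured format.
# Field names and rules are hypothetical placeholders.

def clean_records(raw_records):
    """Return only well-formed records, with fields normalized."""
    cleaned = []
    for rec in raw_records:
        # Skip records missing required fields (a common source of noise).
        if any(rec.get(k) is None for k in ("user_id", "price")):
            continue
        try:
            price = float(rec["price"])  # coerce to a standard numeric type
        except (TypeError, ValueError):
            continue  # unparseable values are treated as noise
        if price < 0:
            continue  # negative prices violate the (assumed) schema
        cleaned.append({"user_id": str(rec["user_id"]), "price": price})
    return cleaned

raw = [
    {"user_id": 1, "price": "19.99"},
    {"user_id": None, "price": "5.00"},   # missing user -> dropped
    {"user_id": 2, "price": "oops"},      # malformed price -> dropped
    {"user_id": 3, "price": -4},          # invalid value -> dropped
]
print(clean_records(raw))  # only the first record survives
```

Real pipelines would typically use a library such as pandas for this, but the shape of the work is the same: filter, coerce, and standardize before modeling.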
The second step of modeling is the algorithm, which must be measured against the generalization ability of the model. An algorithm that performs well is often disturbed when used with different types of data sets; developers frequently find a large gap when they migrate their algorithm to other kinds of data, and they face tremendous bottlenecks when trying to build a multi-task model, or a model general enough for the real world.
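The generalization gap described above can be made concrete with a minimal sketch: fit a deliberately simple model on one data set, then evaluate it on a second, differently distributed one. Both the threshold classifier and the synthetic data sets below are purely illustrative assumptions:

```python
# Sketch: measuring the gap when a model trained on one data set
# is evaluated on a differently distributed one.
# The "model" is a toy threshold classifier; the data is synthetic.

def fit_threshold(samples):
    """Learn a decision threshold as the midpoint of the two class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(samples, threshold):
    """Fraction of samples the threshold rule classifies correctly."""
    correct = sum((x > threshold) == (y == 1) for x, y in samples)
    return correct / len(samples)

# Data set A: classes well separated around the learned threshold.
dataset_a = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
# Data set B: same task, but the feature distribution has shifted down.
dataset_b = [(-0.1, 0), (0.0, 0), (0.3, 1), (0.4, 1)]

t = fit_threshold(dataset_a)   # threshold near 0.5
print(accuracy(dataset_a, t))  # perfect on the data it was fitted to
print(accuracy(dataset_b, t))  # only half right on the shifted data set
```

The model is "correct" in both cases by its own logic; only the data moved. That is exactly the gap developers hit when migrating an algorithm between data sets.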
Just like any physical product that greatly improves over time, an algorithm needs to be iterated through several rounds of simulation and made accurate within the context of the intelligent knowledge it is meant to capture.
Today's available algorithms are capable of extracting only about 70% of the intelligence from the data, leaving the remaining 30% behind as empty space, noise, and unlabeled data.
For many developers, the basic model of an algorithm is open source, but the fundamental task of extracting intelligence from that algorithm is no longer open source. Developers' efforts are therefore massively directed toward an algorithm's competitiveness, its richness, and its performance.
Algorithm and Demand Diversity
The accuracy of an algorithm depends on the multiple tasks it must handle. Because data sets are diverse, an algorithm's performance may fluctuate: in human face detection, for example, different face data sets disturb the algorithm's actual performance. Even after training a model on more than 10,000 samples, you may get only 80% accuracy, which falls far short of the accuracy developers demand.
It is also widely believed among developers that achieving 95% accuracy with only a few samples is very difficult; sparse data often becomes the root cause of a high false alarm rate, and such a model never lands in a real-world situation.
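Why a respectable accuracy can coexist with an unacceptable false alarm rate is easy to show with confusion-matrix counts. The numbers in this sketch are invented for illustration only:

```python
# Sketch: accuracy alone can hide a high false alarm rate.
# All counts below are invented for illustration.

def rates(tp, fp, tn, fn):
    """Return (accuracy, false alarm rate) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # False alarm rate: fraction of negatives wrongly flagged as positive.
    false_alarm = fp / (fp + tn)
    return accuracy, false_alarm

# A detector evaluated on 1,000 samples: 950 negatives, 50 positives.
acc, far = rates(tp=45, fp=95, tn=855, fn=5)
print(acc)  # 0.9 -> looks respectable
print(far)  # 0.1 -> one in ten negatives triggers a false alarm
```

With such a skewed class balance, 90% accuracy still means the system cries wolf on 10% of all negatives, which is why models of this kind rarely survive real-world deployment.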
Cloud Computing Power
Because the world is complicated, many algorithms become very complex, and executing them requires strong computing power. Computing power is cheap compared with a decade ago, but for complex algorithms it remains relatively expensive: this complicated world is more complex than we can imagine, and evaluating results from the data takes a great many processing cycles.
To build a successful model with big data, the typical computing power required ranges from 0.5T to 2T, though how much your model needs depends on the model itself. Sometimes, when the requirements are less complex, your model does not need the maximum computing power, and it is a waste of resources to allocate maximum computing power to less complex scenarios.
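The resource-sizing point above can be sketched as a back-of-envelope estimate: multiply the compute cost of one inference by the expected request rate, and provision accordingly. The per-sample FLOP counts and request rates below are hypothetical placeholders, not measurements:

```python
# Back-of-envelope sketch: estimating how much computing power a model
# needs, so resources are not over-provisioned for simple scenarios.
# All numbers here are hypothetical placeholders.

def required_tflops(flops_per_sample, samples_per_second):
    """Sustained compute needed, in teraFLOPS."""
    return flops_per_sample * samples_per_second / 1e12

# A lightweight model: ~10 million FLOPs per inference, 1,000 requests/s.
light = required_tflops(10e6, 1000)  # 0.01 TFLOPS -> far below the ceiling
# A heavier model: ~2 billion FLOPs per inference, 500 requests/s.
heavy = required_tflops(2e9, 500)    # 1.0 TFLOPS -> mid-range provisioning
print(light, heavy)
```

Under these assumed numbers, the lightweight scenario needs a tiny fraction of the 0.5T to 2T range, which is exactly the case where reserving maximum computing power would waste resources.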