I love Physics. And had I not developed a taste for living my professional life outside of university campus labs, I would have stayed the course. At some point, however, I sensed I would enjoy the lifestyle that many software developers have. Naturally, Data Science seemed like the best of both worlds and became my new direction.
In this article, I would like to share some of my observations on how Physics may enrich Data Science. We will go beyond the simple “stay curious” narrative. Instead, we will focus on more subtle things and look into the “thinking patterns” that distinguish Physics from, for example, Software Engineering. Hopefully, we will discover areas where familiarity with Physics can have a positive impact on a data project, as well as places where it can fall short.
Formally speaking, the word science denotes a systematic enterprise that builds and organizes knowledge through observations and testable predictions. Physics, as a natural science, uses scientific methods to build an understanding of matter. Concisely, it studies the world and helps predict its behavior.
What is Data Science then?
Data Science is a young field that tries to use scientific methods to build insights and predictions from data. Here, the word “data” can refer to particular physical measurements, but it can also describe opinions or represent any synthetic or abstract information. Because of that, its primary area of interest is very different from that of Physics. Furthermore, since data today is mostly a digital record, the whole discipline is tightly related to Computer Science and Software Engineering. Finally, unlike Physics, Data Science mostly operates in a business context.
Still, contrary to the computing fields, Data Science is supposed to do something about the data, preferably extracting the information and projecting it onto the future. And contrary to pure business, it emphasizes rationality.
It doesn’t sound too far from Physics, or does it?
Let’s begin with similarities. A common practice that exists in both disciplines is the modeling of reality. Physicists study the world by measurement and describe it using mathematical equations. Software developers map problems using abstractions and express them through code.
The abstractions themselves seem to have originated from a well-known concept in Physics, namely an isolated system. We call a system isolated when we assume a certain idealization of the world and treat it as a detached entity. This approach makes the problems easier to describe and test.
In Data Science, despite the different nature of the information, we also try to create a representation of the problem we face through isolation or abstraction. It is a good start.
Now, consider the following:
“Photons are objects of a class boson and electrons are leptons. Atoms are instances of particles and use .bond() method to form molecules… Now, it is all about executing the universe.run(), correct?”
Apart from skipping about a million abstraction layers: yes. At least it shows the direction of thinking. Besides, isn’t modeling the whole universe the ultimate goal of Physics as a science? It is! Right?
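Tongue in cheek, the quote above could even be typed out. The sketch below is purely illustrative; every class and method name here is made up for the joke and belongs to no real physics library.

```python
# A playful, purely illustrative sketch of the quote above.
# All names are hypothetical; this is not a real physics API.

class Boson:
    """Force carriers, e.g. photons."""

class Lepton:
    """Matter particles, e.g. electrons."""

class Photon(Boson):
    pass

class Electron(Lepton):
    pass

class Atom:
    def __init__(self, name):
        self.name = name

    def bond(self, other):
        """Combine two atoms into a 'molecule' (just a label here)."""
        return f"{self.name}-{other.name}"

h, o = Atom("H"), Atom("O")
print(h.bond(o))  # a toy 'molecule'
```

Of course, no `universe.run()` would follow from this; that is exactly the point made next.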
Almost, but not exactly.
While it may be tempting to think this way, it is not what physicists are looking for. Instead of trying to compile all the fundamental building blocks, physicists are concerned with the blocks themselves. They know that once the bricks are understood, the ultimate building is obtained by combining, repeating, arranging and rearranging those pieces.
Forming a higher structure is, of course, not an easy task, but computer engineers are experts in finding optimal arrangements of the bricks. It is also their primary field of interest.
The reasoning appears almost opposite between the two. While computer experts climb the ladder of abstraction, physicists eagerly go down to examine the seeds with a magnifying glass. In Data Science, both thinking paths are needed. To create an efficient predictive model, a data scientist must understand which data components are required, but also how they should be arranged. Which features seem to make sense? If that sounds like exploratory data analysis (EDA), physicists have been doing it all along.
Speaking of features making sense, Physics, even the most “applied” one, requires that a fair amount of insight is built. A profound understanding of the reasons for why things work (or not) is essential not only to construct a meaningful hypothesis but also to succeed in the long run.
Conversely, Software Engineering’s key focus is delivering a business solution. It emphasizes “how” over “why”, which is both a strength and a weakness. It is a plus when it comes to tackling complexity, optimizing performance and guaranteeing robustness. But when plunging into the data, even domain-driven design does not bring us too far; it is simply another way of developing a product. In other words, Computer Science does not have a recipe for figuring out what is meaningful about the data.
Thinking of Data Science, it may seem as if Computer Science plus EDA might do the trick. Unfortunately, that falls short.
Physics can be theoretical, but it remains an experimental science. Be it at the Large Hadron Collider or just a Gedankenexperiment, it teaches how to conduct the process from stating a hypothesis to analyzing the results. When building data products, software farms turn into Big Data military compounds for conducting virtual experiments. Although nobody gets hurt by flying bits (and bytes), designing and carrying out these controlled data explosions requires more than setting up the necessary infrastructure and waving a “keep off” sign.
Relying solely on statistical equations does not guarantee success either. One has to think holistically of what is being investigated. Although a predictive model is the end goal, it often takes extensive experimentation before it is reached.
Here, if Data Science asked Physics for advice, it would point toward sharpening the experimental process. Such a process should foster generating insights without compromising scrutiny. It may also encompass collateral activities such as data collection, whatever it takes to ensure one can trust the results.
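One concrete way of sharpening an experimental process is to ask, before celebrating a result, how often it would arise by chance. Below is a minimal permutation test on invented A/B conversion data, standard library only; the numbers are made up, and the point is the loop from hypothesis to analyzed result.

```python
# A minimal sketch of an experimental loop on made-up A/B data:
# state a hypothesis, run a controlled test, analyze the result.
import random

random.seed(0)

# hypothetical conversion outcomes from an A/B experiment
group_a = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # new variant
group_b = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]  # control

observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

# permutation test: if the group labels were meaningless, how often
# would a difference at least this large appear by shuffling alone?
pooled = group_a + group_b
n_extreme, n_trials = 0, 10_000
for _ in range(n_trials):
    random.shuffle(pooled)
    diff = sum(pooled[:10]) / 10 - sum(pooled[10:]) / 10
    if diff >= observed:
        n_extreme += 1

p_value = n_extreme / n_trials
print(f"observed lift: {observed:.2f}, p-value ~ {p_value:.3f}")
```

With these toy numbers, the lift looks large yet is not clearly significant, which is exactly the kind of scrutiny the experimental process should enforce before a model ships.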
So far, we have considered only the positive sides of Physics in the context of Data Science. How about the areas where you would not turn to Physics for inspiration?
An example would be to run a purely data-driven project without a clear delivery constraint. Such a project is a straightforward recipe for getting stuck in perpetual experimentation. While it can be fun for the team, it can be expensive for the organization. The main reason for it is the business context of Data Science, and Physics does not have a stop mechanism there.
Secondly, an intellectual understanding of the building blocks does not guarantee the solution to the bigger picture problem. Software developers are often more capable of distilling the levels of granularity and spotting implementation challenges before any development even starts. Ultimately, it is engineering work to make it “all fit together”, and knowing bricks does not yet let one build a house. Sometimes, it may be unjustified to dig too deep.
Finally, there is also an unhealthy tendency among at least some physicists to accumulate data just “in case”. Not only does it cause a real headache for the developers, but in extreme cases, the whole development may not even be able to start! Again, insight is obtained by asking the right questions — not a lot of questions.
As a discipline, Physics is much older than Computer Science and incomparably more mature than Software Engineering. When Archimedes jumped out of his tub, the world hadn’t yet heard of al-Khwarizmi, and when Isaac Newton formulated the laws of dynamics, the first processor was still centuries away. Software Engineering, by contrast, is barely a hundred years old, and the widespread adoption of the Internet counts only the last thirty. Despite the challenges it faces each day, if it were Medicine, we could say we have just learned to wash our hands.
On the other hand, the pace at which technology advances and its omnipresence in our lives are unprecedented. Take the total volume of data, estimated at around 50 zettabytes today (2020), as an example. If one byte of information weighed 1 kg, the total mass of the world’s data would already amount to two-thirds of the Moon. And only five years from now, it would exceed the Moon’s mass twice over. We, who volunteered to make sense of the data, cannot afford to be lazy in learning.
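The back-of-envelope arithmetic above can be checked directly. The Moon’s mass and the roughly 175 ZB five-year projection are rounded, commonly cited figures, used here only to verify the comparison.

```python
# Back-of-envelope check of the "data vs. Moon" comparison,
# assuming one byte weighs 1 kg.
ZETTA = 10**21
data_bytes = 50 * ZETTA   # ~50 ZB of global data (2020 estimate)
moon_mass_kg = 7.35e22    # approximate mass of the Moon

ratio = data_bytes / moon_mass_kg
print(f"data 'mass' is {ratio:.2f} of the Moon")  # roughly two-thirds

# five years on, estimates put the total closer to ~175 ZB
future_ratio = 175 * ZETTA / moon_mass_kg
print(f"projected: {future_ratio:.2f} of the Moon")  # more than twice
```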
Physics, being such an old discipline, did have the time to work out some proven methodologies. Sometimes I even think of Data Science as virtual Physics, although every time I do, it does feel a bit odd. Nevertheless, if Data Science were to take some advice from Physics, it would immediately go towards the efforts in experimentation and analysis.
Thinking in first principles can be a great benefit to Data Science too, provided it does not turn into “overthinking”. After all, first-principles thinking has stood behind some of the greatest discoveries in history. It is worth bearing that in mind.
Finally, Data Science-related activities cannot be disengaged from the context of an organization’s business model. Unlike Physics, which is a fundamental science, the main goal of Data Science is to help grow a business. If this is ensured, Physics may offer very complementary views to Data Science and indeed contribute to its success.
For this week’s data science career interview, we got in touch with Dr Suman Sanyal, Associate Professor of Computer Science and Engineering at NIIT University. In this interview, Dr Sanyal shares his insights on how universities can contribute to this highly promising sector and what aspirants can do to build a successful data science career.
With industry-linked, technology- and research-driven seamless education, NIIT University has been recognised for addressing the growing demand for data science experts worldwide with its industry-ready courses. The university has recently introduced a B.Tech in Data Science course, which aims to use data sets and models to solve real-world problems. The programme provides industry-academic synergy for students to establish careers in data science, artificial intelligence and machine learning.
“Students with skills that are aligned to new-age technology will be of huge value. The industry today wants young, ambitious students who have the know-how on how to get things done,” Sanyal said.
If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.
In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.
The buzz around data science has sent many youngsters and professionals on an upskilling/reskilling spree. Prof. Raghunathan Rengasamy, the acting head of the Robert Bosch Centre for Data Science and AI, IIT Madras, believes data science knowledge will soon become a necessity.
IIT Madras has been one of India’s prestigious universities offering numerous courses in data science, machine learning, and artificial intelligence in partnership with many edtech startups. For this week’s data science career interview, Analytics India Magazine spoke to Prof. Rengasamy to understand his views on the data science education market.
With more than 15 years of experience, Prof. Rengasamy is currently heading RBCDSAI-IIT Madras and teaching at the department of chemical engineering. He has co-authored a series of review articles on condition monitoring and fault detection and diagnosis. He has also been the recipient of the Young Engineer Award for the year 2000 by the Indian National Academy of Engineering (INAE) for outstanding engineers under the age of 32.
Of late, Rengaswamy has been working on engineering applications of artificial intelligence and computational microfluidics. His research work has also led to the formation of a startup, SysEng LLC, in the US, funded through an NSF STTR grant.
Data Science has become an important part of today’s industry. It is used to transform business data into assets that help organizations improve revenue, seize business opportunities, improve customer experience, reduce costs, and more. It has also become one of the trending subjects to learn.
Its popularity has grown over the years, and companies have started implementing data science techniques to grow their business and increase customer satisfaction. In an online Data Science course, you learn how Data Science deals with vast volumes of data, using modern tools and techniques to find unseen patterns, derive meaningful information, and make business decisions.
Advantages of Data Science: In today’s world, data is being generated at a staggering rate at all times, whether from the users of social networking sites, from the calls one makes, or from different businesses. Because of this huge amount of data, the field of Data Science offers many advantages.
Some of the advantages are mentioned below:
Multiple job options: Because of its high demand, Data Science provides a large number of career opportunities across various roles, such as Data Scientist, Data Analyst, Research Analyst, Business Analyst, Analytics Manager, Big Data Engineer, etc.
Business benefits: In a Data Science online course, you learn how data science helps organizations know how and when their products sell well, so that products are always delivered to the right place at the right time. Organizations can then make faster and better decisions to improve efficiency and earn higher profits.
Highly paid jobs and career opportunities: Data Scientist salaries across positions are generous. According to a Dice Salary Survey, the annual average salary of a Data Scientist is $106,000 per year.
Hiring benefits: Big Data and data mining have made it comparatively easier to sort through data and look for the best candidates for an organization. They have simplified the processing and selection of CVs, aptitude tests and games for recruitment teams.
Disadvantages of Data Science: Where there are pros, there are also cons, so we discuss both to make it easier to choose a Data Science course without any doubts. Let’s check some of the disadvantages of Data Science:
Data privacy: Data is used to increase the productivity and revenue of an industry by enabling game-changing business decisions. However, the information or insights obtained from the data may be misused against an organization.
Cost: The tools used for data science and analytics can cost a corporation a great deal, as some of them are complex and require people to undergo data science training to use them. It is also very difficult to pick the right tools for the circumstances, because their selection depends on proper knowledge of the tools as well as their accuracy in analyzing the data and extracting information.
Our latest survey report suggests that as the overall Data Science and Analytics market evolves to adapt to the constantly changing economic and business environments, data scientists and AI practitioners should be aware of the skills and tools that the broader community is working on. A good grip on these skills will further help data science enthusiasts get the best jobs that various industries are offering in their data science functions.
In this article, we list down 50 latest job openings in data science that opened just last week.
(The jobs are sorted according to the years of experience required.)
Skills Required: Real-time anomaly detection solutions, NLP, text analytics, log analysis, cloud migration, AI planning, etc.
Skills Required: Data mining experience in Python, R, H2O and/or SAS, cross-functional, highly complex data science projects, SQL or SQL-like tools, among others.
Skills Required: Data modelling, database architecture, database design, database programming such as SQL, Python, etc., forecasting algorithms, cloud platforms, designing and developing ETL and ELT processes, etc.
Skills Required: SQL and querying relational databases, statistical programming language (SAS, R, Python), data visualisation tool (Tableau, Qlikview), project management, etc.
**Location:** Bibinagar, Telangana
Skills Required: Data science frameworks such as Jupyter Notebook and AWS SageMaker, querying databases and using statistical computing languages (R, Python, SQL), statistical and data mining techniques, distributed data/computing tools such as MapReduce, Flume, Drill, Hadoop, Hive, Spark, Gurobi, MySQL, among others.