Why is Python used so widely in big data analysis despite being slow?

I have noticed that Python is used a lot in big data.

People call C functions from Python, process the results further in Python, and then call other libraries (possibly Python ones as well) that also operate on gigantic data arrays. A rough sketch of the kind of pipeline I mean is below.
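For example, something like this is what I have in mind (just an illustrative sketch; the file name and column names are placeholders). Python glues the steps together, while each step's heavy loop runs in compiled code inside the library:

```python
import numpy as np
import pandas as pd

df = pd.read_csv("events.csv")            # C-backed CSV parser
x = df["value"].to_numpy()                # contiguous C array, not Python objects
z = (x - x.mean()) / x.std()              # vectorised math runs in C
top = df.loc[np.argsort(z)[-10:]]         # sorting runs in C, Python just indexes
```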

Isn't this an extremely inefficient way of doing things? Python is much slower than C++. How can it make sense, performance-wise, to use Python in situations where large amounts of data are processed?

One company asked me the interview question: "How do you bind a C function to Python that computes a 1 GB floating-point array, and then compute the total of all the numbers in Python?" They asked it from a position that assumes using Python is completely normal, and that one should do such things as computing a 1 GB floating-point array in C, copying it into a gigantic Python list, and then summing the numbers in Python. But doesn't the question itself assume that things are done extremely inefficiently? They seem indoctrinated, treating what they do as normal when it is far from it.
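To be concrete, here is a rough sketch of how I understand the task (the library name `libcompute.so` and the function name `compute_array` are hypothetical; I am assuming the C side fills a caller-provided buffer with doubles). Note that even here the array never becomes a Python list, and the final sum runs in C inside NumPy rather than in a Python loop:

```python
import ctypes
import numpy as np

N = 1_000_000_000 // 8   # ~1 GB worth of float64 values

# Hypothetical shared library exposing: void compute_array(double *out, size_t n)
lib = ctypes.CDLL("./libcompute.so")
lib.compute_array.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
lib.compute_array.restype = None

buf = np.empty(N, dtype=np.float64)       # allocate the 1 GB buffer once
lib.compute_array(buf.ctypes.data_as(ctypes.POINTER(ctypes.c_double)), N)

total = buf.sum()                         # reduction happens in compiled code
```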

So why is Python used so widely, as opposed to using C++, for example? Is this because many people feel that Python is much easier and C++ is too hard?

#python #data-analysis #big-data #data-science #machine-learning
