Best Practices in Python and why Python is so popular

Python is a versatile language that has attracted a broad base of users and has become one of the most popular programming languages, with its popularity growing exponentially over the last decade. According to one estimate, the previous five years saw more new Python developers than conventional Java/C++ programmers. So why is Python so popular? The primary reasons are its simplicity, its ease of learning, and the speed of development it enables.

Why does Python have an edge over the other programming languages? Let’s find out!

  • Everything is an object in Python
  • Support for Object-Oriented Programming – including multiple inheritance, instance methods, and class methods
  • Attribute access customization
  • List, dictionary, and set comprehensions
  • Generator expressions and generator functions (lazy iteration)
  • Standard library support for queues, fixed-precision decimals, and rational numbers
  • Wide-ranging standard library including OS access, Internet access, cryptography, and much more.
  • Strict nested scoping rules
  • Support for modules and packages
  • Python is used in the data science field
  • Python is used in machine learning and deep learning
  • Parallel Programming
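A few of these features are easiest to appreciate in code. The short sketch below shows comprehensions, a generator expression, and a generator function; all names and data are purely illustrative:

```python
# List, dict, and set comprehensions build collections declaratively.
squares = [n * n for n in range(5)]                   # [0, 1, 4, 9, 16]
square_map = {n: n * n for n in range(3)}             # {0: 0, 1: 1, 2: 4}
unique_lengths = {len(w) for w in ("a", "bb", "cc")}  # {1, 2}

# A generator expression evaluates lazily: values are produced on demand,
# so no billion-element list is ever built here.
lazy_squares = (n * n for n in range(10**9))
print(next(lazy_squares))  # 0

# A generator function does the same with the `yield` keyword.
def countdown(n):
    """Yield n, n-1, ..., 1 without materialising a list."""
    while n > 0:
        yield n
        n -= 1

print(list(countdown(3)))  # [3, 2, 1]
```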

As a Python developer, you should know some basic techniques and practices that help keep your work flowing smoothly. Some of the best practices in Python are listed below.

Create Readable Documentation

In Python, readable documentation is itself a best practice. You may find it a little burdensome, but it leads to cleaner code. For this purpose you can use docstrings along with tools such as Markdown, reStructuredText, and Sphinx. reStructuredText and Markdown are markup languages with plain-text formatting syntax that make it easy to mark up text and convert it into formats like HTML or PDF. Sphinx is a tool that builds intelligent, attractive documentation from such sources and can export it to formats like HTML, while docstrings let you write documentation in-line, right next to the code it describes.
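As a minimal sketch, a docstring placed directly under a function definition becomes part of the code's built-in documentation, which tools like Sphinx can then export; the function itself is just an illustration:

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from Fahrenheit to Celsius.

    Args:
        temp_f: Temperature in degrees Fahrenheit.

    Returns:
        The equivalent temperature in degrees Celsius.
    """
    return (temp_f - 32) * 5 / 9

# The docstring is available at runtime via help() or __doc__.
print(fahrenheit_to_celsius.__doc__.splitlines()[0])
# Convert a temperature from Fahrenheit to Celsius.
```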

Follow Style Guidelines

Python follows a system of community-generated proposals known as Python Enhancement Proposals (PEPs), which attempt to provide a basic set of guidelines and standards for a wide variety of topics relevant to Python development. One of the most widely referenced PEPs ever created is PEP 8, also termed the “Python community Bible” for properly styling your code.
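To make this concrete, here is a small sketch of a few PEP 8 conventions in action (the class is purely illustrative): CapWords for class names, snake_case for functions and variables, and spaces around operators.

```python
# Before (non-PEP 8):  class rect:  def CalcArea(W,H):return W*H

class Rectangle:
    """A simple rectangle, named in CapWords per PEP 8."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def calc_area(self):
        # snake_case method name, spaces around the operator.
        return self.width * self.height

print(Rectangle(3, 4).calc_area())  # 12
```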

Immediately Correct your Code

When creating a Python application, it is almost always more beneficial in the long term to acknowledge and repair broken code quickly.

Give Preference to PyPI over Manual Coding

The practices above will help you produce clean, elegant code. However, one of the best tools for improving your use of Python is its huge module repository, the Python Package Index (PyPI). Whatever your level of experience as a Python developer, this repository will be very beneficial: most projects begin by building on existing packages from PyPI, which hosted over 10,000 projects at the time of writing. There is almost certainly some code there that will fulfil your project’s needs.

Watch out for Exceptions

Developers should watch out for exceptions: they can creep in from anywhere and are difficult to debug.

Example: One of the most annoying is the KeyError exception. To handle this, a programmer must first check whether or not a key exists in the dictionary.
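For example, with a hypothetical inventory dictionary, there are several idiomatic ways to avoid a KeyError:

```python
inventory = {"apples": 3, "pears": 5}

# Option 1: look before you leap -- check the key first.
if "bananas" in inventory:
    count = inventory["bananas"]
else:
    count = 0

# Option 2 (often the most concise): dict.get() with a default value.
count = inventory.get("bananas", 0)
print(count)  # 0

# Option 3: catch the exception explicitly.
try:
    count = inventory["bananas"]
except KeyError:
    count = 0
```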

Write Modular and non-repetitive Code

A class or function should be defined whenever some operation needs to be performed multiple times. This shortens your code, increases readability, and reduces debugging time.
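As an illustration, a normalisation step that would otherwise be copy-pasted for every dataset can be factored into one reusable function (the function name and data are illustrative):

```python
# Repetitive version: the same min-max scaling logic pasted once per dataset.
# Modular version: define the operation once and reuse it everywhere.
def normalize(values):
    """Scale a list of numbers into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
print(normalize([2, 4]))        # [0.0, 1.0]
```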

Use the right data structures

The benefits of different data structures are well known, and choosing the right one results in higher working speed, reduced storage space, and higher code efficiency.
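For instance, membership testing scans every element of a list (O(n)) but is a constant-time hash lookup on a set, which this small benchmark illustrates:

```python
import timeit

words = [str(n) for n in range(100_000)]
word_list = words        # O(n) membership test
word_set = set(words)    # O(1) average membership test

# Looking up an item that is not present (the worst case for the list):
list_time = timeit.timeit(lambda: "missing" in word_list, number=100)
set_time = timeit.timeit(lambda: "missing" in word_set, number=100)
print(set_time < list_time)  # True -- the set lookup is far faster
```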

These are good practices that every Python developer should follow for a smooth experience with the language. Python is a growing language, and its increasing use in Data Analytics and Machine Learning has proved very useful for developers; Python for AI has also gained popularity in recent years. In the coming years Python should continue to have a very bright future, and programmers proficient in it will have an advantage.

Renowned Data Science Personalities

With the advancement of big data and artificial intelligence, the need for their efficient and ethical usage also grew. Prior to the AI boom, the main focus of companies was to find solutions for data storage and management. With the advancement of various frameworks, the focus has shifted to data processing and analytics, which require knowledge of programming, mathematics, and statistics. In more popular terms, this process today is known as Data Science. A few names stand out whenever data science comes into the picture, largely due to their contributions to the field and the careers they have devoted to advancing it. Let’s talk about some of the best data scientists in the world.


Andrew Ng

Andrew Ng is one of the most prominent names among leaders in the fields of AI and Data Science, counted among the best machine learning and artificial intelligence experts in the world. He is an adjunct professor at Stanford University and a co-founder of Coursera, and he formerly headed the AI unit at Baidu. He is also an enthusiastic researcher, having authored and co-authored around 100 research papers on machine learning, AI, deep learning, robotics, and related fields. He is highly appreciated by new practitioners and researchers in data science, has worked in close collaboration with Google on the Google Brain project, and is among the most followed data scientists on social media and other channels.

DJ Patil

The Data Science Man, DJ Patil, needs no introduction. He is one of the most famous data scientists in the world and an influential personality well beyond Data Science. He co-coined the term “data scientist” and served as Chief Data Scientist at the White House. He was also Head of Data Products, Chief Scientist, and Chief Security Officer at LinkedIn, and Director of Strategy, Analytics, and Product / Distinguished Research Scientist at eBay Inc. The list just goes on.

DJ Patil is inarguably one of the top data scientists in the world. He received his PhD in Applied Mathematics from the University of Maryland, College Park.

Kirk Borne

Kirk Borne has been the chief data scientist and a leading executive advisor at Booz Allen Hamilton since 2015. A former NASA astrophysicist, he was part of many major projects, and after the 9/11 attack on the WTC he was called upon by the US administration to analyze data in an attempt to prevent further attacks. He is one of the top data scientists to follow, with over 250K followers on Twitter.

Geoffrey Hinton

Geoffrey Hinton is known for his astonishing work on artificial neural networks, and was the brain behind the ‘backpropagation’ algorithm used to train deep neural networks. He currently divides his time between Google and the Computer Science department at the University of Toronto. His research group has done groundbreaking work in the resurgence of neural networks and deep learning.

Geoff coined the term ‘Dark Knowledge’.

Yoshua Bengio

Having worked with AT&T and MIT as a machine learning expert, Yoshua holds a PhD in Computer Science from McGill University, Montreal. He is currently the head of the Montreal Institute for Learning Algorithms (MILA) and has been a professor at Université de Montréal for the past 24 years.

Yann LeCun

Director of AI Research at Facebook, Yann has 14 registered US patents. He holds a PhD in Computer Science from Pierre and Marie Curie University and is a professor of Computer Science and Neural Science at New York University, where he is also the founding director of the NYU Center for Data Science.

Peter Norvig

Peter Norvig is a co-author of Artificial Intelligence: A Modern Approach and Paradigms of AI Programming: Case Studies in Common Lisp, two insightful books on programming and artificial intelligence. Peter has close to 45 publications to his name. Currently an Engineering Director at Google, he previously spent three years in various computational science roles at NASA. Peter received his PhD in Computer Science from the University of California.

Alex “Sandy” Pentland

Named the ‘World’s Most Powerful Data Scientist’ by Forbes, Alex has been a professor at MIT for the past 31 years and has served as a chief advisor at Nissan and Telefonica. Alex has co-founded many companies over the years, including Home, Sense Networks, Cogito Corp, and many more. Currently, he is on the Board of Directors of the UN Global Partnership for Sustainable Development Data.

These are just a few leaders from a vast community. There are many unnamed contributors whose work is the reason we have recommender systems, advanced neural networks, fraud-detection algorithms, and many of the other intelligent systems we rely on to fulfil our daily needs.

Tableau vs PowerBI: 10 Big Differences

The concept of using pictures to understand patterns in data has been around for centuries, from graphs and maps in the 17th century to the invention of the pie chart in the mid-1800s. The 19th century witnessed one of the most cited examples of data visualization when Charles Minard mapped Napoleon’s invasion of Russia. The map depicted the size of Napoleon’s army along with the path of his retreat from Moscow, and tied that information to temperature and time scales for a more in-depth understanding of the event.

Read more about data Visualisation in our previous blog – Practices on Data Visualisation.

In the modern world, the search for a Business Intelligence (BI) or data visualisation tool brings us to two front runners: PowerBI and Tableau. These are the top data visualization tools, and both products are equipped with handy features like drag-and-drop and data preparation, among many others. Although similar, each comes with its particular strengths and weaknesses, which is why articles titled “Tableau vs PowerBI” are so often encountered. The following comparisons provide insight into which data visualization tool is best for which purpose.

The tools will be compared on the following grounds:

  • Cost
  • Licensing
  • Visualization
  • Integrations
  • Implementation
  • Data Analysis
  • Functionality

Cost
Cost remains a significant parameter when these products are compared, because at one end PowerBI is priced around $100 a year, while Tableau can be rather expensive, up to $1,000 a year. PowerBI is more affordable and economical than Tableau and is suitable for small businesses. Tableau, on the other hand, is built for data analysts and offers in-depth insight features. So, when it comes to the Tableau vs PowerBI cost comparison, PowerBI is the more affordable alternative.

Licensing
Licensing largely comes down to whether one wants to pay an upfront cost for the software. If yes, then Tableau should be chosen; otherwise, one should opt for PowerBI.

Visualization
When it comes to visualization features, both products have their strengths. PowerBI can prove better if the desired outcome is clear, elegant visuals, and it lets you upload datasets easily. However, if the prime focus is visualization of large datasets, Tableau leads by a fair margin: it performs better with larger datasets and gives users efficient drill-down features.

Integrations
PowerBI has API access and pre-built dashboards for speedy insights into some of the most widely used technologies and tools, such as Salesforce, Google Analytics, and Microsoft products. Tableau, meanwhile, has invested heavily in integrations and widely-used connections; a user can view all of the included connections right upon logging into the tool.

Implementation
This parameter, along with maintenance, depends primarily on factors such as the size of the company and the number of users. PowerBI comes out fairly more straightforward on the grounds of implementation and requires a lower level of expertise. Tableau, although a little more complex, offers more variety and incorporates quick-start applications for deploying small-scale applications.

Data Analysis
Power BI with Excel offers speed and efficiency and establishes relationships between data sources. On the other hand, Tableau provides more extensive features and helps the user in hypothesizing data better.

Functionality
For the foreseeable future, any organization which has users spending more than an hour or two per day using their Business Intelligence tool might want to go with Tableau. Tableau offers a lot of features and minor details that are unmatched.

Feature                         | Power BI                     | Tableau
--------------------------------|------------------------------|-----------------------------
Date Established                | 2013                         | 2003
Best Use Case                   | Dashboards & ad-hoc analysis | Dashboards & ad-hoc analysis
Best Users                      | Average Joe/Jane             | Analysts
Licensing                       | Subscription                 | Subscription
Desktop Version                 | Free                         | $70/user/month
Investment Required             | Low                          | High
Overall Functionality           | Very Good                    | Very Good
Visualisations                  | Good                         | Very Good
Performance with Large Datasets | Good                         | Very Good
Support Level                   | Low (or through partner)     | High

It all depends upon who will be using these tools. Microsoft-powered Power BI is built for the common stakeholder, not necessarily for data analysts. The interface relies on drag-and-drop and intuitive features to help teams develop their visualizations. It is a great addition to any organization that needs data analysis without a degree in data analysis, or to any organization with smaller funds.

Tableau is more powerful, but the interface isn’t quite as intuitive, which makes it more challenging to use and learn. It requires some experience and practice to have control over the product. Once this is achieved, Tableau is better than PowerBI and can prove to be much more powerful for data analytics in the long run.

A Short History of Data Science

Over the past two decades, tremendous progress has been made in the field of Information Technology, with exponential growth in technology and machines. “Data” and “Analytics” have become two of the most commonly used words of the past decade. As they are interrelated, it becomes essential to know how they relate and how they are evolving and reshaping businesses.

Data Science was officially accepted as a field of study in 2011, although different or related names had been in use since 1962.

There are six stages in which the development of Data Science can be summarised:

Stage 1: Contemplating the power of data
This stage witnessed the rise of the data warehouse, where business and transaction records were centralised into vast repositories. The period began in the early 1960s. In 1962, John Tukey published the article The Future of Data Analysis, a source that established a relationship between statistics and data analysis. In 1974, another data enthusiast, Peter Naur, gained popularity for his article Concise Survey of Computer Methods, in which he used the term “Data Science”, which would grow into a vast field with many applications in the 21st century.

Stage 2: More research on the importance of data
This period saw businesses begin researching the importance of collecting vast amounts of data. In 1977, the International Association for Statistical Computing (IASC) was founded. In the same year, Tukey published his second major work, Exploratory Data Analysis, arguing that emphasis should be placed on using data to suggest hypotheses for testing, and that exploratory and confirmatory data analysis should proceed side by side. The year 1989 saw the establishment of the first workshop on data discovery, titled Knowledge Discovery in Databases (KDD), now more popularly known as the annual ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).

Stage 3: Data Science gained attention
The early forms of data markets began to appear during this phase, and Data Science started attracting the attention of businesses. The idea of analysing data was sold and popularised; the 1994 Business Week cover story titled “Database Marketing” supports this rise. Businesses started to witness the importance of collecting and applying data for their profit, and various companies started stockpiling massive amounts of data. However, they did not know what to use it for, or how. This led to the beginning of a new era in the history of Data Science.

The term Data Science was taken up again in 1996 at the International Federation of Classification Societies (IFCS) conference in Kobe, Japan. In the same year, Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth published “From Data Mining to Knowledge Discovery in Databases”. They described Data Mining, stating: “Data mining is the application of specific algorithms for extracting patterns from data.”

The additional steps in the KDD process, such as data preparation, data selection, data cleaning, incorporation of appropriate prior knowledge, and proper interpretation of the results of mining, became essential to ensure that useful knowledge is derived from the data.

Stage 4: Data Science started being practised
The dawn of the 21st century saw significant developments in the history of data science. Throughout the 2000s, various academic journals began to recognise data science as an emerging discipline. Data science and big data seemed to work ideally with the developing technology. Another notable figure who contributed greatly to this field is William S. Cleveland. He co-edited Tukey’s collected works, developed valuable statistical methods, and published the paper “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics”.

Cleveland put forward his notion that data science was an independent discipline and named six areas where data scientists should be educated namely multidisciplinary investigations, models and methods of data, computing with data, pedagogy, tool evaluation, and theory.

Stage 5: A New Era of Data Science
By now, the world had seen plenty of the advantages of analysing data. The term “data scientist” is attributed to Jeff Hammerbacher and DJ Patil, who chose the word carefully; a buzzword was born, and the discipline it named developed significantly. In 2013, IBM shared the statistic that 90% of the world’s data had been created in the previous two years alone. By this time, companies had also begun to view data as a commodity upon which they could capitalise, and emphasis shifted to transforming large clusters of data into usable information and finding usable patterns.

Stage 6: Data Science in Demand
The major tech giants saw significant growth in demand for their products after applying data science. Apple credited big data and data mining for increased sales, and Amazon reported selling more Kindle books online than ever. Companies like Google and Microsoft used deep learning for speech and voice recognition, and AI techniques further enhanced the usage of data. Data became so precious that companies started collecting all kinds of data from all sorts of sources.

Putting it all together: data science did not have a prestigious beginning and was long ignored by researchers, but once its importance was properly understood by researchers and business people alike, it helped them reap substantial profits.

Ethical issues in Artificial Intelligence – Problems and Promises

With the growth of Artificial Intelligence (AI) in the 21st century, the ethical issues surrounding AI grow in importance along with the technology itself. Typically, ethics in AI is divided into robo-ethics and machine-ethics. Robo-ethics concerns the moral behaviour of humans as they design and construct artificially intelligent beings, while machine-ethics relates to the ethical conduct of artificial moral agents (AMAs). In the modern world, countries are stockpiling weapons, artificially intelligent robots, and other AI-driven machines, so it becomes important to analyse the risks of artificial intelligence: whether it will take over major jobs, and how its uncontrolled and unethical usage could affect humanity. These ethics were formulated to protect humanity from such ill-effects and risks.

Robot ethics, more popularly known as roboethics, concerns the morality of how humans design, construct, use, and treat robots. It considers how artificially intelligent beings (AIs) may be used to harm humans and how they may be used to benefit them, and it emphasizes that machines with artificial intelligence should prioritize human safety above everything else while keeping human morality in perspective.

Can AI be a threat to human dignity?

One of the first voices raised against the potential ill-effects of artificially developed beings came in 1976, when Joseph Weizenbaum argued that AI should not be used to replace people in positions that require respect and care, such as:

  • A customer service representative
  • A therapist
  • A soldier
  • A police officer
  • A judge

Weizenbaum explains that we require authentic feelings of empathy from people in these positions; if machines replace them, the people they serve will feel alienated, devalued, and frustrated. However, there are also voices in support of AI on the matter of partiality, since a machine could be impartial and fair.

Biases in AI System

The most widespread use of AI in today’s world is in voice and facial recognition, and cases of AI bias are increasing accordingly. Many of these systems have real business implications and directly impact people. A biased training set will result in a biased predictor, and bias can creep into algorithms in many ways, posing one of the biggest threats in AI. As a result, large companies such as IBM and Google have started researching and addressing bias.

Weaponization of Artificial Intelligence

Ever since Weizenbaum’s 1976 objection to giving arms to machines, disputes have continued over whether robots should be granted some degree of autonomous function.

There has been a recent outcry about the engineering of artificial intelligence weapons, including fears of a robot takeover of humanity. These AI weapons present a type of danger far different from that of human-controlled weapons, and powerful nations have already begun to fund programs to develop them.

“If any major military power pushes ahead with the AI weapon development, a global arms race is virtually inevitable, and the endpoint of this technological trajectory is obvious: autonomous weapons will become the Kalashnikovs of tomorrow”, read the words of a petition signed by Skype co-founder Jaan Tallinn and many MIT professors, among other supporters against AI weaponry.

Machine Ethics, or Machine Morality, is the field of research concerned with designing Artificial Moral Agents (AMAs): robots and artificially intelligent beings made to behave morally, or as though moral. The science-fiction writer Isaac Asimov considered the issue in the 1950s in his famous short-story collection I, Robot, where he proposed his three fundamental laws of robotics. His work also suggests that no set of fixed laws can sufficiently anticipate all possible circumstances. In 2009, during an experiment at the Laboratory of Intelligent Systems at the École Polytechnique Fédérale de Lausanne, Switzerland, robots that were programmed to cooperate eventually learned to lie to each other in an attempt to hoard a beneficial resource.

Concluding, Artificial Intelligence is a necessary evil. Artificial Intelligence-based beings (friendly AIs) can represent a gigantic leap in technological development for humanity, and they come with a set of miraculous advantages. However, if the technology falls into the wrong hands, the destruction could be unimaginable and unstoppable. As Claude Shannon quipped, “I visualize a time when we will be to robots what dogs are to humans, and I’m rooting for the machines.” Thus, ethics in the age of artificial intelligence is supremely important.

Everything you need to know about Automated Machine Learning

What is Automated Machine Learning?

Automated Machine Learning (AutoML) is the term used for technology that automates the end-to-end process of applying machine learning to real-world problems. A typical machine learning problem requires a dataset consisting of some input data on which a training model is to be built. The input data may not arrive in a form to which every machine learning algorithm can be applied, so an ML expert must implement the appropriate procedures (including data pre-processing, feature scaling, and feature extraction) to produce a dataset suitable for machine learning. Building the model then involves selecting the algorithm that maximizes performance on that dataset. Many of these steps are beyond the abilities of non-experts. With this in mind, AutoML was proposed as an Artificial Intelligence-based solution to the daunting challenge of applying machine learning, and AutoML in Python and R started gaining popularity. (Read – AI and ML. Are they one and the same?)

What is the Need for AutoML?

The idea of AutoML took off with the development in the field of Artificial Intelligence. It all took shape when Jeff Dean, Google’s Head of AI, suggested that “100x computational power could replace the need for machine learning expertise”. This raised several questions:

Do hundreds of thousands of developers need to “design new neural nets for their particular needs,” or is there an effective way for Neural Networks to generalize similar problems? Or can a large amount of computation power replace machine learning expertise?

Clearly, the answer is NO. Many factors support the idea of AutoML:

  • Shortage of machine learning expertise
  • Machine-Learning expertise is cost-inefficient

For large organizations requiring high efficiency, AutoML cannot replace a machine learning expert; however, it can be cost-effective and useful for smaller organizations.

Applications of AutoML

AutoML platforms such as Google Cloud AutoML can be used for the following tasks:

  • Automated Data Preparation
    Involves column type detection, intent detection, and automated task detection within the dataset.
  • Feature Engineering
    Includes feature scaling, meta-learning, and feature selection.
  • Automated Model Selection
    AutoML can search over candidate algorithms and pick the best-performing model.
  • Automated Problem Checking
    Problem checking and debugging can be automated.
  • Automated Analysis of Results
    Automated reporting of results can save time and capital.
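As a toy illustration of the automated-model-selection idea (not a real AutoML API; every name and the dataset below are made up for the sketch), a hyperparameter can be chosen automatically by measuring validation error:

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x (1-D k-NN)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def auto_select_k(train, valid, candidates):
    """Return the k with the lowest mean squared error on the validation set."""
    def mse(k):
        return sum((knn_predict(train, x, k) - y) ** 2 for x, y in valid) / len(valid)
    return min(candidates, key=mse)

# A noisy synthetic dataset, split into training and validation halves.
random.seed(0)
data = [(x, 2 * x + random.uniform(-1, 1)) for x in range(100)]
train, valid = data[::2], data[1::2]

best_k = auto_select_k(train, valid, candidates=[1, 3, 5, 10, 25])
print(best_k)
```

Real AutoML systems apply the same search-and-evaluate loop at a much larger scale, across whole pipelines of pre-processing steps, algorithms, and hyperparameters.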

Here is a good read – Two Real Life Examples of Google’s Automated Machine Learning.

Popular AutoML libraries such as Featuretools, Auto-sklearn, MLBox, TPOT, H2O, and Auto-Keras all contribute to an enhanced AutoML experience.

Advantages of AutoML

  • The libraries are effortless to install.
  • The introduction of Cloud AutoML has sped up the development of AutoML.
  • It is cost-effective and labour-efficient.
  • It requires a lower level of expertise.

Limitations of AutoML

Although it comes with a set of advantages, advanced AutoML introduces the concept of hyperparameters, which themselves need to be learnt. AutoML can be usefully applied to tasks that can be generalized, but for problems that are unique and require some level of expertise, it performs poorly.

Future of AutoML

Automated Machine Learning (AutoML) has been gaining traction within the Data Science community. This surge of interest is reflected in the development and release of numerous open-source AutoML tools and libraries, mentioned above, and in the emergence of businesses focused on building and commercializing AutoML systems (such as DataRobot, DarwinAI, H2O.ai, and OneClick.ai). AutoML is a hot topic for the industry, but it is not set to replace data scientists. Besides the difficulty of automating many data science tasks, its purpose is to assist data scientists and free them from the burden of repetitive, less demanding jobs that can be generalized, so they can invest their time in tasks that are more challenging, creative, and harder to automate. (AutoML: The Next Wave of Machine Learning)

Concluding, we live in an era where the growth of data outpaces our ability to make sense of it. AutoML is an exciting technological field that has been in the spotlight and promises to mitigate this problem through developments in artificial intelligence.

We expect significant strides of progress in this field in the near future, and we recognize the help of AutoML systems in solving many of the challenges that we face out there.