AI Beginner’s page - aipathfinder.org - Pathfinder for Artificial Intelligence

Common AI questions and answers

What is AI?

Artificial intelligence is software, running on a computing device, that tries to replicate the functioning of the human brain. It tries to mimic human behaviour by independently replicating complex human decision making and human tasks. Artificial intelligence may also be thought of as the simulation of human intelligence processes using “analytical models” running on computer systems. Specific applications of AI include expert systems, natural language processing, speech recognition and machine vision.

AI works by enabling computers to perceive the environment, learn, reason, plan, abstract and act on data (machine learning). ‘Machine learning’ and ‘Deep learning’ are techniques to achieve AI.

An important difference between conventional software and AI is that AI does not have to be re-programmed every time our environment changes. AI uses data as the input for ‘training’ instead of programming. Changed data will result in the AI output changing automatically once the model has been trained on it. In normal software, this change in output will not happen and the program has to be modified manually. This is the essential difference between the two.
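The difference can be sketched in a few lines of Python. This is only an illustration: the order sizes, the threshold of 100 and the function names are assumptions made up for the example, and "training" here is nothing more than taking an average.

```python
from statistics import mean

def hardcoded_is_large_order(quantity):
    # Conventional software: the threshold is written into the program.
    # If the business changes, a programmer has to edit this number.
    return quantity > 100

def train_threshold(past_quantities):
    # "Training": derive the threshold from data instead of hard-coding it.
    return mean(past_quantities)

def learned_is_large_order(quantity, threshold):
    return quantity > threshold

old_data = [50, 80, 120, 150]    # hypothetical past order sizes
new_data = [200, 300, 400, 500]  # hypothetical data after the environment changed

old_threshold = train_threshold(old_data)
new_threshold = train_threshold(new_data)

print(hardcoded_is_large_order(180))               # same answer whatever the data says
print(learned_is_large_order(180, old_threshold))  # True with the old data
print(learned_is_large_order(180, new_threshold))  # False with the new data
```

The code of the learned version never changes; only the data it is trained on does.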

At present, AI performs narrowly defined tasks well. It does not possess the general and complete environmental intelligence and awareness, or the conscience, of a human being¹.

What is the meaning of ‘Training the AI’?

Unlike conventional software, AI is not explicitly programmed to handle every data input for every task. AI uses an algorithm to automatically ‘learn’ from past data provided to it in order to predict a particular output for a specific use-case, for example predicting a company’s sales of an item (say sweaters) for the coming winter season. The data required for this task would be the past three years of sweater sales data. The AI model (algorithm) learns to predict this output from the data provided. The process of learning is called ‘training’ the AI.
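As a rough sketch of what training can look like for the sweater example, the Python below fits a straight-line trend to three years of sales using ordinary least squares. The sales figures and the function name train_linear_model are invented for illustration, and a real forecasting model would normally use far more data and a richer algorithm.

```python
def train_linear_model(xs, ys):
    # Ordinary least squares for a straight line y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

years = [1, 2, 3]                   # the past three winter seasons
sweater_sales = [1000, 1100, 1200]  # hypothetical units sold each season

# Training: the parameters are learned from the data, not typed in by hand.
slope, intercept = train_linear_model(years, sweater_sales)

# The learned parameters can then estimate the coming (fourth) season.
forecast = slope * 4 + intercept
print(slope, intercept, forecast)
```

With these made-up numbers the learned trend is 100 extra sweaters per season, giving a forecast of 1300 for the coming winter.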

What is inference in AI?

When the trained AI model is used to do a prediction for a live or ‘production’ run, it is called an ‘inference’ run.
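A minimal sketch of an inference run, assuming a very simple straight-line model whose parameters were saved by an earlier training run. The parameter values, the JSON layout and the function name infer are all hypothetical.

```python
import json

# Hypothetical parameters produced and saved by an earlier training run.
saved_model = json.dumps({"slope": 100.0, "intercept": 900.0})

def infer(model_json, season_index):
    # Inference: load the already-trained parameters and apply them to a
    # new input. Nothing is learned or changed during this step.
    params = json.loads(model_json)
    return params["slope"] * season_index + params["intercept"]

prediction = infer(saved_model, 4)
print(prediction)
```

Training is done once (or periodically); inference is what happens each time the finished model is asked for a prediction.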

How is data used in AI?

Data is used to train AI models; it is the fuel for training AI. According to an article published on BBC Science Focus, ChatGPT was trained using databases from the internet that included a massive 570 GB of data (approximately 300 billion words) sourced from books, Wikipedia, research articles, web texts, websites and other forms of content and writing on the net³. Training can also use images, videos and other data formats in addition to textual data. Technology companies are vying with each other to capture every bit of individuals’ personal data for training their AI products and applications.

How should we use AI and what are its benefits?

We should use AI to improve our productivity, enhance our capabilities, obtain better predictions and more. We should use LLMs to summarize documents, emails and other content such as web pages. Microsoft has embedded AI features in its productivity tools. These can be very powerful in reducing the personal effort required to produce presentations, papers, financial calculation sheets and the like. In addition, coders and software professionals are greatly helped by AI foundation models like ChatGPT and others in writing and debugging code. This author has experienced at least a 50% reduction in the effort needed to develop and test code using AI LLMs.

What should we be careful about in AI?

We need to be very careful with our personal data when dealing with AI. Common drawbacks include fake news, manipulated media, data theft, AI bias, unethical practices, and violations of data rights and privacy. Everyone should have the freedom to choose the amount of data which she is willing to share with AI – particularly her sensitive personal data. Sharing data with AI systems is a double-edged sword. Sharing greater amounts of data with AI systems could make the individual more productive and efficient, at the cost of a significant drop in personal security and safety should the data fall into the wrong hands or be sold.

What are the common training models for AI?

The most common models are Supervised learning, Unsupervised learning, Reinforcement learning and Deep learning.

In Supervised learning, machines are given training data categorized as input variables and output variables from which to learn patterns and then make inferences on previously unseen data. The output variables are called ‘labels’. The mapping of labels to input data adds more work for developers.
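One of the simplest supervised methods, a nearest-neighbour classifier, can serve as a sketch here. The temperatures, the labels and the function name predict_label are assumptions made up for the example.

```python
def predict_label(training_examples, new_input):
    # 1-nearest-neighbour: find the training input closest to the new
    # input and return the label that was attached to it.
    _, nearest_label = min(
        training_examples, key=lambda example: abs(example[0] - new_input)
    )
    return nearest_label

# Labelled training data: (input variable, output variable or 'label').
# Here the input is a hypothetical temperature in degrees Celsius.
training_examples = [(2, "cold"), (5, "cold"), (24, "warm"), (30, "warm")]

# Inference on previously unseen inputs.
print(predict_label(training_examples, 8))   # closest to 5, so "cold"
print(predict_label(training_examples, 21))  # closest to 24, so "warm"
```

Someone had to attach "cold" or "warm" to every training temperature, which is the labelling work mentioned above.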

In Unsupervised learning, the learning algorithm is given the input data only, and the algorithm identifies its own patterns in the data through learning. Its goal is to understand the underlying patterns and data structures and to improve its capabilities with additional training on more data. This type of learning is like human learning where we can observe, listen, touch, feel, smell, taste and learn based on experience and comparison of similarities.
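A small sketch of the idea, using a simplified one-dimensional version of k-means clustering with two groups. The purchase amounts and the function name two_means are hypothetical, and no labels are supplied anywhere.

```python
def two_means(values, iterations=10):
    # Simplified k-means with k = 2 on plain numbers.
    # Start with the smallest and largest values as the two group centres.
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer centre.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Move each centre to the average of its group (keep it if empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

purchase_amounts = [1, 2, 3, 20, 21, 22]  # hypothetical, unlabelled data
centers, groups = two_means(purchase_amounts)
print(centers)  # [2.0, 21.0]
print(groups)   # [[1, 2, 3], [20, 21, 22]]
```

The algorithm was never told that there are "small" and "large" purchases; it found the two groups from the similarities in the data alone.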

‘Reinforcement Learning’ is a machine learning training method based on rewarding desired behaviours and/or punishing undesired ones.
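The reward-and-punishment loop can be sketched with a tiny agent that has only two possible actions. The environment, the action names and the settings below (steps, epsilon, learning_rate) are assumptions chosen for illustration, not a standard benchmark.

```python
import random

def reward(action):
    # Hypothetical environment: "right" is the desired behaviour (+1),
    # anything else is punished (-1).
    return 1.0 if action == "right" else -1.0

def train_agent(steps=200, epsilon=0.1, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    values = {"left": 0.0, "right": 0.0}  # the agent's estimate of each action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(values))     # occasionally explore
        else:
            action = max(values, key=values.get)  # otherwise exploit the best estimate
        # Nudge the estimate for the chosen action towards the reward received.
        values[action] += learning_rate * (reward(action) - values[action])
    return values

values = train_agent()
best_action = max(values, key=values.get)
print(best_action)  # "right"
```

Nobody tells the agent which action is correct; it works this out from the rewards and punishments it receives.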

‘Deep learning’ is a machine learning technique that employs algorithms called Deep Neural Networks (DNN). DNNs are a type of Artificial Neural Network (ANN), designed in the connectionist tradition and inspired by neuroscience.
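To give a feel for what a neural network computes, the sketch below passes inputs through two layers of artificial neurons. The weights are chosen by hand (an assumption for this illustration) so that the network computes the XOR function; in real deep learning the weights are learned from data and there are usually many more layers and neurons.

```python
def relu(x):
    # A common activation function: negative values become zero.
    return max(0.0, x)

def identity(x):
    return x

def dense(inputs, weights, biases, activation):
    # One layer: each neuron computes activation(weighted sum of inputs + bias).
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def xor_network(x1, x2):
    # Hidden layer with two neurons, then an output layer with one neuron.
    hidden = dense([x1, x2], [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    output = dense(hidden, [[1.0, -2.0]], [0.0], identity)
    return output[0]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_network(a, b))
```

The output is 1.0 only when exactly one of the two inputs is 1, something a single neuron of this kind cannot compute on its own; stacking layers is what gives the network its power.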

What are the common data-based errors which can happen in AI?

Data-based errors can be due to biases, where input data coverage is insufficient. As a result, some data cases may get missed altogether. Data can also be omitted intentionally by manipulators to create bias. Data security is a serious problem in the age of AI, because the data may flow through several different vendors in the course of providing a service. The least secure link in the chain of vendors determines the overall level of security. Violations of data privacy and data rights, data usage beyond what was permitted, sale of data for consideration, data manipulation and data misuse are all happening today. The implementation and enforcement of data processing in line with data ownership in AI systems leaves much to be desired. While technology for watermarking data is being developed, commercial interests in data are ruling the roost, suppressing genuine needs for the protection of individuals’ personal data. For a more detailed understanding of data, see Chapter 2, “Ethics, Biases and Human-Centric AI”.¹

In which areas are AI applications commonly deployed?

AI applications are most widely deployed in the following areas:

Customer analytics, Fraud detection, Risk management, Stock trading, Targeted advertising, Website personalization, Customer service, Predictive maintenance, Logistics and supply chain management, Image recognition, Speech recognition, Natural language processing, Cybersecurity, Medical Diagnostics (from Chapter 5, “Crafting Intelligent Data Strategies”¹).

What is an LLM?

An LLM (Large Language Model) is the model itself, which comprises the parameters and weightings (contextual understanding) and the algorithm used for NLP (natural language processing). The training data set is not strictly part of the LLM, and training can be a one-time or iterative process. ChatGPT itself was trained through a process called Reinforcement Learning from Human Feedback (RLHF), but the training process, including pre-training, is also not strictly part of the model; it is rather how the model was arrived at.
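The distinction between the model and its training data can be sketched with a toy next-word predictor. This is a deliberately tiny stand-in and not how an LLM is actually built: real LLMs are neural networks with billions of parameters, and the corpus and function names here are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    # Count which word follows which in the training text.
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    # Keep only the learned parameters: next-word probabilities.
    return {
        word: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for word, followers in counts.items()
    }

def most_likely_next(model, word):
    return max(model[word], key=model[word].get)

corpus = "the cat sat on the mat the cat slept"  # hypothetical training data
model = train_bigram_model(corpus)
del corpus  # the training data is no longer needed once the model exists

print(most_likely_next(model, "the"))  # "cat"
```

After training, the model is just the table of learned numbers plus the procedure that uses them; the original text is not stored inside it.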

Am I sufficiently protected by laws from the drawbacks of AI?

No, we are not sufficiently protected by the laws which exist today. AI regulations are lagging behind AI development across the world. Governments and international agencies are struggling to understand how to regulate AI while not restricting its growth. Therefore, it would be prudent for each individual to consider carefully and decide how much personal data to share with AI.

Which commonly used AI applications can I also try out?

GPT-4o – OpenAI
Stable LM 2 – Stability AI
Gemini 1.5 – Google DeepMind
Llama 3.1 – Meta AI
Mixtral 8x22B – Mistral AI
Inflection-2.5 – Inflection AI
Jamba – AI21 Labs
Command R – Cohere
Phi-3 – Microsoft
XGen-7B – Salesforce

Image AI applications:

DALL-E 3 – OpenAI
Adobe Firefly – Adobe
Canva Magic Design – Canva
Midjourney – Independent
DreamStudio – Stability AI

List of Companies producing AI agents

OpenAI – Known for developing advanced AI models like ChatGPT.
Google AI – Focuses on various AI technologies, including personal assistants.
IBM Watson – Offers AI solutions across multiple industries, including personal use.
Amazon AI – Creator of Alexa and other intelligent agents for consumer applications.
Anthropic – Develops AI systems with a focus on safety and usability, including personal assistants.
Microsoft – Provides AI tools integrated into its products, enhancing personal user experiences.
Cohere – Specializes in NLP and custom AI solutions for businesses and personal use.
Stability AI – Known for developing generative models that can be used in personal applications.

References:

1. Tampi, Rajagopal. APPLIED HUMAN-CENTRIC AI: Clarity in AI Analysis and Design (pp. 21-22).

2. What is an LLM? – Large language models explained – PC Guide

3. https://analyticsindiamag.com/ai-origins-evolution/behind-chatgpts-wisdom-300-bn-words-570-gb-data/