Computer vision in the age of Artificial Intelligence (AI)

Computer vision refers to instances of recreating our own eyes in computers or machines

Dr Evans Sagomba
Everything AI

TODAY, we are faced with a situation where computers can suddenly see everything.

I am sure you have wondered about this.

If you have not yet asked yourself this question, well, I have been wondering about it for quite some time, so today let me take you through how computers can see.

Let me explain it to you.

However, before I mesmerise your mind with the complexity of computer vision, I would like to give credit where it is due, and this credit goes to us humans.

As humans, we are capable of spotting a friend in a massive crowd or even finding a book on a cluttered shelf.

This is possible because our minds are incredible.

Our mind controls everything we can accomplish through our eyes.

Here is the catch: can you imagine if computers (machines) could be given the same ability to do everything we can do?

Well, here is a bit of a spoiler as you are still imagining.

Today, computers/machines can do what we can do, and this is possible because of innovation.

Today, as computers/machines are advancing, their advancement is permeating everything we use daily, from our smartphone’s camera to augmented reality games.

Picture this: machines are being taught to see, and they are merging our virtual and real worlds in ways previously confined to science fiction.

Now let us learn together: what is computer vision?

Computer vision refers to instances of recreating our own eyes in computers or machines.

When these eyes are combined with Artificial Intelligence’s (AI’s) unprecedented raw computational strength and pattern recognition abilities, they are transformed into a supercharged version of our own eyes.

This combination allows us to use computer vision and AI to make a presentation using only our hands and nothing else.

Now imagine which other physical tasks in your life are being simplified or enhanced by AI.

Today, AI lets us do everything from controlling our home’s lights to navigating a virtual environment.

How vision has evolved: From neurons to neural networks

It is already spectacular how our human visual process works.

A bit of biology, shall we?

When light enters our human eyes, it is processed through intricate neural pathways.

In that process, we can recognise our surroundings.

However, when it comes to computers and machines, the process is slightly different.

For computers and machines, the task starts with the pixels of an image, which are fed into neural networks, in this case Convolutional Neural Networks (CNNs). These have taken computer vision to levels never seen before.

They enable the processing of videos or live camera feeds using complex AI.

This is used in self-driving cars or autonomous vehicles (AVs).

Hold on tight, we are just beginning. As I continue to break down this process, you are going to appreciate the complex intelligence behind the convenient innovations that will soon surround you.

Whether shopping with augmented reality or hopping into an autonomous taxi, this neural wizardry will be at its core.

Brace yourself for some jargon as we discuss the core components of convolutional neural networks (CNNs).

What are the key components of CNNs? First, we have convolutional layers (CLs). Let me jog your memory: have you ever put on a new pair of glasses and, boom, the world suddenly seemed sharper?

That’s exactly what convolutional layers are for computers.

Convolutional layers use small filters to extract distinct features from small patches of neighbouring pixels in the image.

Moving across the picture, these layers can identify edges, textures, and other patterns.
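For readers who like to tinker, here is a minimal sketch in Python of a filter sliding across a picture. The tiny 5x5 image, the Sobel-style edge filter, and the name apply_filter are my own illustrative assumptions, not taken from any particular library, and real CNNs learn their filter values during training rather than having them written by hand.

```python
# A tiny 5x5 greyscale "image": dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]

# A hand-written 3x3 filter that responds to vertical edges (Sobel-style).
# Assumption for illustration: a real CNN would learn these numbers.
vertical_edge_filter = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def apply_filter(image, kernel):
    """Slide the kernel over the image (no padding, step of 1 pixel).

    Each output value is the sum of pixel * filter value over one patch.
    """
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


feature_map = apply_filter(image, vertical_edge_filter)
for row in feature_map:
    print(row)  # each row prints [36, 36, 0]
```

The large values (36) appear where the patch straddles the jump from dark to bright, and the zero appears where the patch is uniformly bright. In other words, the filter has "found" the edge.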

Second, we have pooling layers (PLs). Pooling layers play a crucial role in reducing the spatial dimensions of the processed image.

These work like study notes for your exams.

When exams are close, all you do is read condensed notes instead of the whole textbook.

Well, computers do the same thing.

In this process, computers only retain or focus on the most salient features.

This pooling process makes the network more efficient and reduces the computational burden.
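To make the "study notes" idea concrete, here is a small Python sketch of one common kind of pooling, called max pooling, which keeps only the largest value in each 2x2 block. The numbers and the name max_pool are my own assumptions for illustration, and other pooling methods (such as averaging) also exist.

```python
def max_pool(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size block."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            block = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(block))
        pooled.append(row)
    return pooled


# A made-up 4x4 grid of feature values (illustrative only).
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 7, 5],
    [1, 1, 3, 8],
]

pooled = max_pool(feature_map)
print(pooled)  # [[6, 2], [2, 8]]
```

Sixteen numbers have been condensed into four, yet the strongest signal in each corner of the grid survives. That is why the network has less to compute afterwards.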

Third, we have fully connected layers (FCLs). Here, every neuron in one layer is connected to every neuron in the next layer.

Their primary role is to interpret the features extracted by the convolutional and pooling layers and make informed decisions or classifications based on them.
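Here is a hedged Python sketch of how such a layer could turn extracted features into a decision. The feature values, the weights, and the labels "cat" and "dog" are all invented for illustration. In a real network the weights are learnt, and there are usually far more of them.

```python
import math


def fully_connected(features, weights, biases):
    """Each output neuron sees every input feature: a weighted sum plus a bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, features)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


def softmax(scores):
    """Turn raw scores into probabilities that add up to 1."""
    highest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical inputs: four features handed over by the earlier layers.
features = [6.0, 2.0, 2.0, 8.0]

# Hypothetical weights: one row per output neuron, one weight per feature.
weights = [
    [0.5, -0.2, 0.1, 0.3],   # neuron voting for "cat"
    [-0.3, 0.4, 0.2, -0.1],  # neuron voting for "dog"
]
biases = [0.0, 0.0]
labels = ["cat", "dog"]

scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
prediction = labels[probabilities.index(max(probabilities))]
print(prediction)  # cat
```

Every feature contributes to every score, which is what "fully connected" means. The label with the highest probability becomes the network's answer.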

How do neural networks learn?

This is similar to how we study for exams and learn from our mistakes.

Neural networks continually update what they have learnt by adjusting tiny settings called weights, trying to make their output match the correct answer.

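This works through backpropagation and gradient descent: backpropagation works out how much each weight contributed to the error, and gradient descent nudges each weight in the direction that reduces it. By repeating this, the computer becomes good at identifying objects in a picture.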
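Learning from mistakes can be sketched in a few lines of Python. This is a deliberately tiny example under my own assumptions: a "network" with a single weight, three made-up training examples where the correct answer is always twice the input, and a learning rate of 0.05. With only one weight, backpropagation reduces to a single derivative, whereas real networks apply the same idea across millions of weights.

```python
# Training examples: (input, correct answer). Here the answer is 2 * input.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

weight = 0.0          # the "tiny setting" the model adjusts; starts out wrong
learning_rate = 0.05  # how big each correction step is (an assumed value)

for step in range(200):
    # Work out how the average squared error changes as the weight changes.
    gradient = 0.0
    for x, target in examples:
        prediction = weight * x
        error = prediction - target
        gradient += 2 * error * x  # derivative of error**2 with respect to weight
    gradient /= len(examples)

    # Gradient descent: step in the direction that reduces the error.
    weight -= learning_rate * gradient

print(round(weight, 3))  # 2.0
```

The weight starts at 0, makes poor predictions, and is corrected a little on every pass until it settles at 2, the value that matches the correct answers.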

This learning process is used in AI-driven product recommendations on streaming platforms and to develop personalised advertisements.

What do AI and computer vision do today? DeepFakes

What it does: DeepFakes are hyper-realistic but entirely fake content created using artificial intelligence.

Initially, this technology was mainly applied to video, especially manipulating video footage of people to make them say or do things they never did.

However, the technology has since expanded to audio, images, and more.

Concerns arise when this technology is used maliciously, leading to misinformation, fraud, or personal attacks.

Widely known examples of AI-driven image manipulation include DeepDream by Google and FaceApp.

So why am I writing this?

This is critical because, in the age of AI, we should be able to distinguish between real and fake images or videos.

With AI, you may be shocked to see a video of someone doing or saying something unbelievable, or an image of them in a place they have never been.

We call these DeepFakes.

This information should help you to understand misinformation and protect yourself from it.

The good in the medical field

Computer vision is not all about bad things.

In medical imaging, computers use AI to analyse medical images and detect diseases or other medical problems.

Good examples are Aidoc and Google’s DeepMind Health.

You should know this, as it will help you understand the future of medical check-ups.

AI can detect even the smallest changes that the human eye cannot see.

Through the use of AI, early cancer detection has become a reality.

Today’s self-driving cars

Now there are self-driving cars that use AI to make decisions, navigate, respond to their surroundings, and travel safely without human oversight.

AI helps self-driving cars to make split-second decisions using data from sensors, cameras, and radars, which helps them to avoid collisions and navigate the highways.

For example, we have Waymo (a subsidiary of Alphabet, Google’s parent company), Tesla’s Autopilot, and NVIDIA’s Drive platform.

What does the future hold?

I am quite sure AI will not just be used for watching; it will be used to predict things before they happen.

For example, it could alert us to potential dangers.

AI will be working like a personal psychic.

But just like any technological advancement, we must create and use it responsibly.

As AI and computer vision evolve, the applications stretch far and wide, potentially influencing everything from entertainment choices to safety precautions.

Staying informed ensures you’re prepared and proactive in this changing landscape.

If there are specific areas of Artificial Intelligence (AI) that you would like addressed, contact the editors or email the author directly and the issue will be addressed in the following week’s column.

Dr Evans Sagomba is a Doctor of Philosophy and Chartered Marketer (CMktr, FCIM) with an MPhil and PhD. He specializes in AI, Ethics, and Policy Research, and is an AI Governance and Policy Consultant. His expertise extends to Ethics of War and Peace and Political Philosophy. Contact: [email protected]. ORCID: 0009-0007-0681-0329. Social media handles: LinkedIn: @Dr. Evans Sagomba (MSc Marketing) (FCIM)(MPhil) (PhD); X: @esagomba.