Artificial intelligence has risen to become one of the most talked-about topics across many technology and business fields. Just look at LinkedIn, for example – #artificialintelligence has nearly 2.5 million followers! By comparison, #digitalforensics has just under 6,000 followers, which says something about how interested people are in artificial intelligence.
I think it’s important to have an honest and realistic understanding of what artificial intelligence is (and isn’t), the effects it will have on the world as it advances and how it has already transformed many of the business practices we take for granted today.
Over the next few months, I’d like to dive into the many facets of artificial intelligence that apply directly to digital forensics and investigations. While I’m looking at the subject from one perspective, many of these views can easily apply to other functions, technologies and industries. To begin the conversation, I think it’s important to look at some of the overlooked distinctions in the types of artificial intelligence to understand where we are today and where we’re headed in the future.
ARTIFICIAL INTELLIGENCE: ANI, AGI AND ASI
According to IBM, a leader in AI development, artificial intelligence “leverages computers and machines to mimic the problem-solving and decision-making capabilities of the human mind.” Discussions about AI range from the futuristically mundane (self-driving cars, a reality even today) to the downright dystopian (who hasn’t seen The Matrix?). I think it’s safe to say that self-driving vehicles aren’t going to take the world over tomorrow and enslave mankind, yet the same label is applied.
There must be some distinction under the broader umbrella of artificial intelligence. This is where the terms artificial narrow intelligence (ANI), artificial general intelligence (AGI) and artificial superintelligence (ASI) come into play.
Artificial Narrow Intelligence
ANI, which also goes by the term “weak AI,” is mostly where we are today. This form of AI is programmed to perform a specific task, and as far back as 1996 we saw this with the famous set of chess matches between Garry Kasparov and Deep Blue. Not only does ANI operate on a specific task, it also bases its decision-making on a specific set of data.
With the advent of the internet and so much data available so readily, ANI can foster the illusion of broader intelligence, but realistically speaking ANI lives up to its name of ‘narrow’ intelligence, what many of us today regard as machine learning. The differences between true artificial intelligence and machine learning deserve their own article (or several!).
Artificial General Intelligence
AGI, “strong AI,” moves into the realm of exhibiting the flexibility of actual human intelligence. Probably the best example of this at present is IBM’s Project Debater, which by some estimates can debate topics at the level of a high school sophomore. This kind of intelligence, which lacks what we would consider sentience, is difficult to produce in computers despite the advances made to date in processing power and speed.
Artificial Superintelligence
ASI raises the bar another level, surpassing human intelligence. This is likely not something we’ll need to worry about until much farther into the future; I’ll potentially touch on ASI in an article down the road.
WHAT DOES ANI MEAN RIGHT NOW FOR INVESTIGATIONS?
There’s always a conversation about whether artificial intelligence will someday replace examiners, which I think is unlikely. There is simply still too much value in the human perspective and decision-making process to expect computers to take over completely given the state of the technology.
What is true, however, is that ANI has changed the face of investigations. Gone are the days of heavy manual file carving or hex review; there’s simply no need to get that technical in every investigation anymore. And while I’d rather not think too much about it, artificial intelligence has done wonders by limiting the amount of time examiners need to spend looking at the disturbing images and videos that make up CP/CSAM cases.
Artificial intelligence, even at the ANI level, has come a long way in its ability to automatically identify things like skin tone, body parts, drugs, weapons and other common artifacts that can lead investigators to the truth in a case.
As I considered this topic, it struck me just how far technology has progressed. As an examiner, it’s possible to do so much more in a much shorter window of time. I’m not a computer ‘nerd’ in the traditional sense – a fact that I’m sure many IT departments I’ve worked with can attest to – but I get genuinely excited as a forensic examiner thinking about the possibilities that exist by combining the Nuix Engine with existing artificial intelligence capabilities.
And I’m looking forward to exploring the topic of artificial intelligence, along with other investigations subjects, in the articles to come!
A recent article published in The Guardian highlighted ‘bias’ on the part of digital forensic examiners when examining seized media. In the original study, the authors found that when 53 examiners were asked to review the same piece of digital evidence, their results differed based on the contextual information they were provided at the outset. Interestingly, whilst some of the ‘evidence’ on which they would base their findings was easy to find (such as in emails and chats), other ‘traces’ were not. These required deeper analysis, such as identifying the history of USB device activity.
One of the things that struck me was that the 53 examiners were all provided with a very short brief of what the case was about (intellectual property theft) and what they were tasked to find (or not find), including a copy of a spreadsheet containing individuals’ details that had been ‘leaked’ to a competitor.
This immediately reminded me of my first weeks within the police hi-tech crime unit (or computer examination unit as it was called). I vividly remember eagerly greeting the detective bringing a couple of computers in for examination into suspected fraud. I got him to fill in our submission form – some basic details about the case, main suspects, victims, date ranges, etc. I even helped him complete the section on search terms and then signed the exhibits in before cheerily telling him that I’d get back to him in the next few weeks (this was in the days before backlogs…).
As I returned from the evidence store, I was surprised to find that same detective back in the office being ‘questioned’ by my Detective Sergeant. “John,” as we will call him (because that was his name), an experienced detective with over 25 years on the job, was asking all sorts of questions about the case:
- Who were his associates?
- What other companies is he involved in?
- Does he have any financial troubles?
- Is he a gambler?
- Did you seize any other exhibits?
- Does he have a diary?
- How many properties does he own?
The list went on. In fact, it was over an hour before John felt that he had sufficient information to allow the detective to leave. Following the questioning, John took me aside and told me that whilst we used the paperwork to record basic information about the case – it was incumbent on us to find out as much information as possible to ensure that we were best placed to perform our subsequent examination.
My takeaway? You can never ask too many questions – in particular, those of the ‘who, where, when’ variety.
HAS DIGITAL FORENSICS CHANGED SINCE THEN?
Given the rapid development in technology since those early days in digital forensics, you would think the way agencies perform reviews of digital evidence would have, well, kept up.
I recently watched a very interesting UK ‘fly on the wall’ TV series (Forensics: The Real CSI) that follows police as they go about their daily work (I do like a good busman’s holiday). One episode showed a digital forensic examiner tasked to recover evidence from a seized mobile phone and laptop in relation to a serious offence.
“I’ve been provided some case-relevant keywords,” he said, “which the officer feels may be pertinent towards the case.” “Murder, kill, stab, Facebook, Twitter, Instagram, Snapchat … and for those keywords I’ve searched for, there is potentially just under 1,500 artifacts that I’ll have to start scrolling through.”
“Have I been transported back to the 90s?” I thought as I watched in (partial) disbelief, reminded once again of John’s sage advice all those years ago about asking lots of questions.
Whilst I understand that the show’s director was no doubt using the scenes to add suspense and tell the story in the most impactful way possible, there is no getting away from the fact that the digital forensic examiner was working with limited information about the case and with some terrible keywords.
Yes, they can (and no doubt did off-camera) pick up the phone to the Officer in the Case (OIC) to ask further questions … surely, the OIC is the one who will see a document or email (that perhaps hasn’t been found by keyword searching) and see a name or address within it and immediately shout “Stop! That’s important!” The OIC will recognize the suspect in a holiday photograph having a beer with another suspect who they swear blind they’ve never met.
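To make the keyword problem concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the sample messages, the name “Smith” and the address “Harbour Road” are hypothetical, and this is not how Nuix or any other forensic tool performs searches. It simply compares which items are returned by the generic keywords quoted from the show against the kind of case-specific terms you might get by asking the OIC more questions.

```python
import re

# Hypothetical messages standing in for items on a seized device.
ITEMS = [
    "Just posted the holiday pics on Facebook",
    "This traffic is going to kill me",
    "Check my Instagram story from last night",
    "Meet J. Smith at 14 Harbour Road on Friday",
    "Snapchat me when you land",
    "Tell Smith the Harbour Road flat is empty",
]


def matching_items(items, keywords):
    """Return the indices of items that contain at least one keyword.

    Matching is case-insensitive and on whole words or phrases only.
    """
    patterns = [
        re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    ]
    return [
        i for i, text in enumerate(items)
        if any(p.search(text) for p in patterns)
    ]


# The generic keywords quoted in the TV episode.
generic = ["murder", "kill", "stab", "Facebook", "Twitter", "Instagram", "Snapchat"]

# Hypothetical case-specific terms, the kind gathered by questioning the OIC.
case_specific = ["Smith", "Harbour Road"]

generic_hits = matching_items(ITEMS, generic)
specific_hits = matching_items(ITEMS, case_specific)

print(generic_hits)   # [0, 1, 2, 4]
print(specific_hits)  # [3, 5]
```

In this toy data set, the generic list returns four items of everyday chatter and misses both messages that mention the (hypothetical) person and address of interest, while two case-specific terms return exactly those two messages. Real cases are far messier, but the pattern is the one John was getting at: better questions lead to better search terms.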
FOCUSING ON THE RIGHT EVIDENCE
How does this all tie back into the research I mentioned at the outset? The various ‘traces of evidence’ the examiners were tasked to find were both ‘hidden in plain sight’ and required skilled forensic analysis in order to identify and interpret their meaning. If the digital forensic examiner spends most of their precious time reviewing emails and documents – in the real world – will they have the time to perform the skilled digital forensics work to build the true picture of what happened?
If the OIC is only provided with material to review based on such basic keyword analysis, or on a couple of paragraphs that give a very high-level overview of the case, will the smoking gun holiday snap make it into the review set?
Expert commentary in the article suggests that “Digital forensics examiners need to acknowledge that there’s a problem and take measures to ensure they’re not exposed to irrelevant, biased information. They also need to be transparent to the courts about the limitations and the weaknesses, acknowledging that different examiners may look into the same evidence and draw different conclusions.”
A spokesperson for the National Police Chiefs’ Council is quoted saying “Digital forensics is a growing and important area of policing which is becoming increasingly more prominent as the world changes … We are always looking at how technology can add to our digital forensic capabilities and a national programme is already working on this.”
Nuix is keen to support this national program and I truly believe that our investigator-led approach to reviewing digital evidence by using Nuix Investigate is the way toward helping to put the evidence into the hands of those who are best placed to make sense of it (the easier ‘traces’ as per the study). Doing so allows the digital forensic examiners to focus on the harder ‘traces’ – such as undertaking deep-dive forensic analysis or ascertaining the provenance of relevant artifacts.
Please note: no digital forensic examiners were harmed in the writing of this blog – and I fully appreciate the hard work they do in helping to protect the public and bring offenders to justice, often working under significant pressures and with limited resources and budgets.