Information Safety

Improving technology through lessons from safety.

Does Phishing Training Work?

I was recently talking with a couple of friends who both work in technology outside of cybersecurity, and our conversation led to one of the most common interactions with the security team: phishing training. Their general experience reflected my own: companies generate simulated phishing emails, send them out periodically, and deliver training to employees who click.

This raises an important question: does phishing training work? In the spirit of the Safety of Work podcast, let’s examine the evidence.

Academic Literature

What does the academic literature have to say? A quick search in Google Scholar turned up a 2020 literature review on the topic, “Don’t click: towards an effective anti-phishing training. A comparative literature review.” The paper reviewed 104 papers published from 2003-2020 that researched phishing attack success rates and/or training effectiveness.

“There is a large body of publications that confirm a decreased likelihood that users will fall victim to phishing messages after educating them with general anti-phishing material or via embedded training.”

There was a consensus that embedded training (simulated phishing emails with education for people who click the link) improved outcomes by reducing click rates, although there was no consensus on how best to educate users. Training needs to be repeated periodically to remain effective, at least every 5 months and preferably quarterly.

The paper also recommends providing a mechanism to report suspicious emails, like the increasingly common “report phishing” button, and tailoring the difficulty of training to the individual - something not done in current commercial systems.

Data

What does the data say? Outside the academic literature, a series of reports by Cyentia and Elevate Security offer additional insights.

The first report, published in 2021, had a number of interesting findings:

  • Completing training 1-3 times reduces average click rates, but performance gets progressively worse at 4 and 5 times; the average click rate after 5 training sessions was higher than with no training at all!
  • Sending more simulation emails decreases average click rates, even at high numbers of simulations, although the effect flattens out just below 5%.

Similar results are reported in the literature review paper: security fatigue is a real problem, and one study cited found two groups that were not affected by training - the “always click” group (11%) and the “never click” group (22%). In a presentation at Secure360 2015, the CEO of Cofense (then PhishMe) also noted that some users (5-10%) would always click the link, no matter how much training they’d received.

Importantly, the Cyentia/Elevate report also noted that 100% of organizations eventually click or are compromised - that is, no matter how much you train, someone within your organization will click the phishing link.

A second report studied the problem in greater detail, finding that:

  • Some users get many more phishing emails than others (100s per year vs. a few).
  • The more phishing emails a department receives, the better it is at blocking them.
  • Most users won’t click the emails that do make it to their inboxes.
  • But some of those who do will click a lot (as much as one click per week).
  • Subjecting all users to the same level/type of treatment is counterproductive.

The analysis showed that nearly 80% of users never click a phishing link, while 4% of users account for 80% of clicks - a small number of high-risk users are the biggest source of phishing clicks.

(A third report studied the question of high-risk users in greater detail.)

My Experience

My own experience, reflected in my recent discussion, is that nearly everyone uses embedded training and has a “report phishing” button or a reporting email address, but does little beyond that. At worst, phishing simulation is a game played by the cybersecurity team to see how many people they can fool, using email templates with substantial inside knowledge (not a realistic example of what employees are likely to receive). At best, it uses realistic examples and periodic training, and provides feedback, with the goal of a 0% click rate. Feedback when reporting phishing is uncommon, for either simulated or real phishing emails.

Education and training typically focus on indicators of phishing emails, like unusual links, unfamiliar email addresses, a sense of urgency, and spelling/grammatical errors. (Interestingly, the literature review paper suggests that looking for spelling errors is not an effective approach.)

Training Differently

So, does phishing training work? Yes, with significant limitations. It’s clear that the embedded training and reporting button used by most companies improve average employee performance, but there is a small group (4-5%) of users who will always fall victim to phishing.

How can we improve? To answer that, I suggest we look at the problem through the lens of Security Differently, and shift the focus from preventing negative outcomes (clicking the link) to promoting positive capacities (reporting and blocking phishing).

If we can never get to a 0% click rate, what should the goal of phishing training be? We want to encourage employees to report suspicious emails (a measurable, positive action) so that we can proactively block those emails and links. This is an important difference, illustrated well by this article, which shares the story of “Vicky”, who correctly identifies a phishing email but doesn’t report it, because “it’s not my thing to deal with.”

Awareness, the focus of most phishing training, addresses only one part of what leads to the desired behavior (reporting) - the Capability. We also need to provide the Opportunity, by making it as easy as possible to report with a reporting button, and the Motivation, through feedback - thanking people for reporting, and following up if an email is not reported, even if the person didn’t click the link.

Even with high rates of reporting, we still have an organizational challenge: someone, somewhere, will click the link. To solve this, we can leverage the fact that most phishing emails are sent to multiple people within our organization (or elsewhere). If we automatically block emails and links based on early reports, we can stop the first click before it happens.

Ironically, this approach was the core of the Cofense presentation at Secure360 in 2015: they created a system where employees were assigned a “credit rating” based on the reliability of their reports, as judged by security analysts. The system was designed to automatically block copies of reported emails once they passed a threshold of reporter reliability and number of reports, without manual intervention by the security team.
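As a rough sketch of how such a system might work - the class names, scoring, and thresholds below are my own assumptions, not details from the Cofense presentation - each report is weighted by the reporter’s track record, and an email is blocked once enough reliable reports accumulate:

    from collections import defaultdict
    from dataclasses import dataclass

    # Hypothetical sketch of reputation-weighted auto-blocking; the scoring
    # and thresholds are illustrative, not from the Cofense system.

    @dataclass
    class Reporter:
        confirmed: int = 0  # reports analysts judged to be real phishing
        total: int = 0      # all reports reviewed by analysts

        @property
        def reliability(self) -> float:
            # New reporters start at a neutral 0.5 until they build a history.
            return self.confirmed / self.total if self.total else 0.5

    class PhishReportQueue:
        MIN_REPORTERS = 3      # require several independent reports
        SCORE_THRESHOLD = 2.0  # combined reliability needed to auto-block

        def __init__(self):
            self.reporters = defaultdict(Reporter)  # reporter id -> Reporter
            self.reports = defaultdict(set)         # email hash -> reporter ids

        def report(self, reporter_id, email_hash):
            """Record a report; return True if copies should be auto-blocked."""
            self.reports[email_hash].add(reporter_id)
            ids = self.reports[email_hash]
            score = sum(self.reporters[r].reliability for r in ids)
            return len(ids) >= self.MIN_REPORTERS and score >= self.SCORE_THRESHOLD

        def analyst_verdict(self, reporter_id, was_phish):
            """Update a reporter's 'credit rating' after analyst review."""
            rep = self.reporters[reporter_id]
            rep.total += 1
            if was_phish:
                rep.confirmed += 1

With thresholds like these, three reporters with perfect track records clear the bar immediately, while three brand-new reporters (0.5 each) fall just short - keeping a human in the loop until reporters prove themselves.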

Unfortunately, I’ve never come across an organization that has fully adopted this model. If your organization has, please get in touch - I’d love to hear about the real-world effectiveness of this approach!


SIRAcon 2023

As I’ve mentioned in recent posts, I’m cataloging my past presentations, and this is the last one! This talk from SIRAcon 2023 summarizes my experiences leading Site Reliability Engineering (SRE): “Measuring and Communicating Availability Risk”.

The particular focus of my SRE work was measuring availability and availability risk, and I learned quite a bit over the 3 years or so I did SRE. One of the key lessons was that the value of measuring availability using Service Level Objectives (SLOs) lies in decision support (SIRA helped with this framing). That is, SLOs and the associated measurements help you make decisions about what to do, whether during an incident, tactically over the course of a month, or strategically over several months and into the future.

Our biggest success came from measuring availability in ways that supported all three timescales. Using an explicitly defined, customer-focused measure of “available”, we were able to construct visualizations that helped during incidents (real-time), during maintenance planning (one month), and for longer-term work (many months).

A key element of this success was the business imperative: the work supported a large and important client, who had just negotiated a significant increase in availability by no longer allowing us to exclude scheduled downtime from our availability target. The Service Level Indicator (SLI) we created helped our incident responders understand outages better, and the SLO we created allowed our teams to schedule maintenance with confidence, or confidently defer it. A hidden benefit was that the metrics, being based on direct observations from our monitoring tools, aligned the different stakeholders on a common view of how available our systems were - the new approach we developed was even adopted by our client as an improvement.
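To make the decision-support framing concrete, here’s a minimal sketch of the error-budget arithmetic behind these kinds of decisions. The 99.9% target, 30-day window, and function names are illustrative assumptions, not the actual SLI/SLO from the talk:

    # Illustrative error-budget arithmetic for SLO-based decision support.
    # The 99.9% target and 30-day window are hypothetical, not our actual SLO.

    SLO_TARGET = 0.999              # 99.9% of minutes "available"
    WINDOW_MINUTES = 30 * 24 * 60   # rolling 30-day window
    BUDGET = (1 - SLO_TARGET) * WINDOW_MINUTES  # ~43 "bad" minutes allowed

    def error_budget_remaining(bad_minutes: float) -> float:
        """Fraction of the window's error budget still unspent."""
        return 1 - bad_minutes / BUDGET

    def can_schedule_maintenance(bad_minutes: float, planned_minutes: float) -> bool:
        """Tactical decision: take planned downtime only if the budget covers it.

        Relevant here because scheduled downtime now counts against the target.
        """
        return bad_minutes + planned_minutes <= BUDGET

    # Example: 10 bad minutes spent, 20-minute maintenance window proposed.
    print(error_budget_remaining(10))         # ~0.77 of the budget remains
    print(can_schedule_maintenance(10, 20))   # True (30 <= 43.2)

The same arithmetic supports all three timescales: incident responders can watch the budget drain in real time, while planners can check whether the remaining budget covers a proposed maintenance window.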

A copy of my slides is here, and the visual notes from the talk are below! As a bonus, you’ll get to see a photo of my dog, Gertie, added at the last minute as part of an ongoing cats vs. dogs competition at the conference.

[visual notes from the talk]


SIRAcon 2022

As I posted yesterday, I’m working through cataloging my past presentations, and I’m nearly done! Today I’m sharing a talk from SIRAcon 2022 that’s quite different from what I typically do: “Making R Work for You (With Automation)”.

Many SIRA members do data analysis as part of their work and talk about the results of that analysis at SIRAcon. However, we don’t often talk about the mechanics of our craft - how we go about doing data analysis. In 2019, though, Elliot Murphy gave a talk about just that, showing how to use Jupyter (Python) and R Notebooks for data analysis. His presentation inspired me to start working with R using R Notebooks, and I wanted to share what I’d learned and built to automate my workflow.

I think the talk went reasonably well, although it was hard to say for sure, as the conference was once again virtual that year. Unfortunately, some of the people whose feedback I most wanted weren’t able to attend - although one of them later watched the replay and shared that my approach was similar to his.

Aside from learning how to write better R code, I learned a couple of things from the experience (both doing it and talking about it):

  • Doing something brings deeper knowledge than reading about it. One of my goals with R was to learn good software engineering practices (documentation, testing, source code control, etc.), including DevOps practices (continuous integration and continuous delivery, CI/CD). While my experience was limited mainly to working alone, I came away with a better and deeper understanding of what it’s like to write modern software.
  • If writing software were more physically demanding, we’d probably do a better job of creating tools and automation to help with the writing. As I noted in my talk, the carpenters who worked on our house spent a whole day setting up their environment to make it easier to move the materials they were removing to the dumpster, rather than trying to brute-force the work. Experience and the challenge of physical labor led them to an economy of movement.

A copy of my slides is here, and the visual notes from the talk are below!

[visual notes from the talk]
