Open Source Intelligence for Cyber Offense and Defense
[Slide 1]
Hi, all. In this video, I explore the use of open source intelligence by attackers, how to mitigate the associated risk, and I demonstrate how to use three popular open source collection tools.
[Slide 2]
Open Source Intelligence, or OSINT, has been an intelligence tool for both enemies and friends for many years. The internet now provides even more opportunities to collect information about individuals, organizations, and governments that can be used in malicious ways. Consequently, organizations must understand their OSINT footprint and how to manage associated risk. In addition to providing information on how to do this, I include at the end of the article videos showing how three OSINT collection tools work.
[Slide 3]
First, some terms. OS Data is raw information available from a variety of internet and media resources. We create OS information by extracting and making sense of OS Data. OS intelligence, or OSINT, is the development, analysis, and application of OS information collected specifically to achieve some result.
[Slide 4]
OSINT is not new. Its structured use began at least as early as 1941 when the United States Office of Strategic Services, or the OSS, created an office dedicated to collecting, correlating, and analyzing information. Resources included print and broadcast media. Today, OS data resources also include information readily available on the internet.
[Slide 5]
When creating OSINT, we use two basic approaches. In the first, we use traditional search engines to search for and gather information. We then manually look for relationships to create OSINT.
The second approach adds automation to both the collection and the correlation of data. This lets users of OSINT, both white hat and black hat, quickly create information ready for analysis and use.
I demonstrate both approaches later.
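To make the idea of automated correlation concrete, here is a minimal sketch in Python. It assumes the collected OS data is already a list of simple records; the record fields, the sample names, and the function name correlate_by_domain are all hypothetical and chosen only for illustration. Real correlation tools work with far richer data and relationship types.

```python
from collections import defaultdict


def correlate_by_domain(records):
    """Group collected records by the email domain they share."""
    groups = defaultdict(list)
    for record in records:
        email = record.get("email", "")
        if "@" not in email:
            continue  # skip records without a usable address
        domain = email.rsplit("@", 1)[1].lower()
        groups[domain].append(record["name"])
    return dict(groups)


# Hypothetical OS data, as might be gathered by hand from search results.
collected = [
    {"name": "A. Smith", "email": "asmith@example.com"},
    {"name": "B. Jones", "email": "bjones@EXAMPLE.com"},
    {"name": "C. Lee", "email": "clee@example.org"},
]

related = correlate_by_domain(collected)
print(related)
```

Grouping by a shared attribute is the simplest form of correlation: unrelated-looking records become a picture of who belongs to which organization.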
[Slide 6]
The internet is the primary tool used today for managing and operating organizations. This falls well into what intelligence agencies call C4I: command, control, communications, computers, and intelligence. Understanding what information about our organization and its information resources, including users, is available on the internet is an important part of good threat intelligence, vulnerability management, and overall risk management.
[Slide 7]
Web and internet OS data sources include
Blogs
Discussion Groups
Any user created content
Online publications
Social networking sites
Database services
o Factiva
o Lexis-Nexis
o Dialog
Information stored on internet facing devices
[Slide 8]
Before going further, it is important to understand the difference between the web and the internet. The internet is the infrastructure upon which the web exists. Many different types of services run across the internet, providing email, file transfer, remote access, facility-to-facility connectivity, and more.
The web is a part of the internet, usually accessed with browsers and other software that use HTTP or HTTPS. However, with the right tools, anyone can see and collect information that browsers do not show but that sits where anyone on the internet can access it.
[Slide 9]
In addition to the internet, other sources of OS data, found on or off the internet, include
Public government data
Commercial and professional publications
Imagery
Financial and industrial analyses
[Slide 10]
A set of OS data sources many organizations might not think of is known as gray literature. It includes written information not intended for public use. It is, however, often shared. Sharing can easily result in documents ending up where they are easily found by internet searches or by searches through the trash. Gray literature includes
Technical reports
Preprints
Patents
Working papers
Unpublished works
Newsletters
Business proposals
Requests for proposal
[Slide 11]
When an attacker collects information as part of the reconnaissance phase of an attack, she includes searches for
Information about our information resources, including network information and what database information might be available. Also searchable are cloud control panels and other internet facing management resources with default or no passwords.
Documents, papers, presentations, and other material never meant for public release. If these files exist on an internet connected device, especially one scanned by common search engines, they are readily available for inclusion in OSINT.
Employee social network content. Even if employees do not post sensitive information, they post enough about themselves to enable some of these attacks.
[Slide 12]
Cybercriminals and government actors use our personal and organizational information in various ways. From a network attack perspective,
Social engineering attacks grow stronger as the attacker is able to gather more information about the targeted users.
Guessing passwords is easier when attackers understand user interests and environments, including family names. Further, the use of vendor default passwords can be identified and leveraged.
By understanding an organization’s internet-facing infrastructure, attackers may find denial of service attacks easier to carry out or discover opportunities they had not already considered.
OSINT provides insight into what is running on devices, known vulnerabilities, and known exploits.
Attackers can aggregate small pieces of what you might not consider sensitive information and derive intellectual property or other very sensitive information.
OSINT is also used to determine which organizations to attack. Which organizations have something of high value? Is the cost of stealing those high-value items worth the return? What is the best attack vector for a specific organization? Who are the best targets for targeted phishing or other types of social engineering?
[Slide 13]
It’s clear that attackers use OSINT for reconnaissance. Consequently, organizations must understand their OSINT footprint. This requires regularly using the same tools as cybercriminals to
Identify information publicly available
Know when users are sharing sensitive information or information that can contribute to deriving sensitive information
In addition, security teams must
Control which systems are connected to the internet and regularly scan internet-connected systems for unwanted content
Develop threat intelligence processes to adjust their organizations’ OSINT footprint continuously
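As one illustration of scanning for unwanted content, the following Python sketch flags documents whose text contains markers that suggest they were never meant for public release. The pattern list, the sample documents, and the function name flag_sensitive are assumptions made for this example, not a standard; each organization would need to tune the markers to its own data.

```python
import re

# Hypothetical marker list; a real team would tune this to its own data.
SENSITIVE_PATTERNS = [
    re.compile(r"password\s*[:=]", re.IGNORECASE),
    re.compile(r"\bconfidential\b", re.IGNORECASE),
    re.compile(r"\binternal use only\b", re.IGNORECASE),
]


def flag_sensitive(documents):
    """Return the names of documents whose text matches any sensitive pattern."""
    flagged = []
    for name, text in documents.items():
        if any(pattern.search(text) for pattern in SENSITIVE_PATTERNS):
            flagged.append(name)
    return sorted(flagged)


# Hypothetical text pulled from files on an internet-connected system.
public_documents = {
    "press-release.txt": "We are pleased to announce our new office.",
    "notes.txt": "Router login. Password: changeme",
    "proposal.txt": "CONFIDENTIAL draft for the steering committee.",
}

flagged = flag_sensitive(public_documents)
print(flagged)
```

A keyword scan like this produces false positives and misses a great deal, so it is best treated as a first pass that tells a person where to look.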
[Slide 14]
This slide is a summary of what we’ve covered so far. It represents a common approach to the development and use of OSINT.
Identify sources
Collect OS Data
Correlate OS Data to derive meaning and connections that help better understand target systems
Analyze the resulting OS Information to find the best opportunities for achieving attack objectives
Create and implement a plan
[Slide 15]
Now I will demonstrate how three of the OSINT tools work. They include Shodan, Google, and Maltego. Shodan and Google collect data but do not provide correlation. Maltego helps to correlate collected data.
[Slide 16]
Shodan is commonly used by attackers and researchers. It is one of the best ways to determine the extent of exposure to an emerging threat across global networks. Using it for simple searches is free. A $49 perpetual license enables both reporting and integration into your solutions. Shodan relies on banner information. If an organization closely manages banner displays on its devices, Shodan will have limited success in collecting OS data.
Let’s see how this works.
[Shodan video]
No Transcript
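Because banner-based collection depends on what a device announces about itself, a short Python sketch can show why managing banners matters. This is not Shodan’s code or API; it is a hypothetical helper, server_products, that reads the Server header from two made-up HTTP banners and extracts any product and version pairs it finds.

```python
import re

# Matches "Product/1.2.3" style tokens often seen in service banners.
PRODUCT_VERSION = re.compile(r"([A-Za-z][\w.-]*)/(\d+(?:\.\d+)*)")


def server_products(banner):
    """Return (product, version) pairs from the Server header of an HTTP banner."""
    for line in banner.splitlines():
        header, _, value = line.partition(":")
        if header.strip().lower() == "server":
            return PRODUCT_VERSION.findall(value)
    return []


# Hypothetical banners: one verbose, one closely managed.
verbose_banner = "HTTP/1.1 200 OK\r\nServer: Apache/2.4.41 (Ubuntu)\r\n"
minimal_banner = "HTTP/1.1 200 OK\r\nServer: webserver\r\n"

verbose_result = server_products(verbose_banner)
minimal_result = server_products(minimal_banner)
print(verbose_result, minimal_result)
```

The verbose banner gives away a product name and version that can be matched against known vulnerabilities, while the managed banner yields nothing useful.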
[Google Exploit Search Demo]
Now let’s take a look at one of the search giants for OS data collection: Google Search. When the vast majority of us use Google, we enter simple word or phrase searches. However, this is not where the real treasures lie.
Google Dorks are search strings that use Google search operators to look for specific things in specific locations. Johnny Long began several years ago to put together a database of useful dorks searchable by what it is a person is looking for. This is now the Google Hacking Database. It is available at the link shown (https://www.exploit-db.com/google-hacking-database).
The best way to learn how to use dorks is just to explore. However, I’ll show you an example of what you might find about your organization.
In this example, I typed “password” in the Quick Search field. This returns a large list of dorks that search for the word “password” in page or document titles of specific types. You can also focus on specific URLs. URL focus is useful in searches checking your organization’s exposure.
Scrolling down, I chose the highlighted dork. It searches for “password” and “Login Info” in text files. No URL is provided. However, it’s easy to add the “inurl” operator to any dork.
I click on the link, which takes me to the dork’s general information page.
Moving to the upper right, I click on the link provided. This performs the actual search on Google. I don’t have to go far to find something interesting. Clicking on Password Hint, I get access to login information in a text file created in January. Again, the best way to understand what you can find using Google for OS Data collection is to play around with it. Google dorks are a great way to start understanding your organization’s OSINT footprint.
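The way operators combine into a dork can be sketched in a few lines of Python. This is only an illustration: build_dork is a hypothetical helper, example.com stands in for your own domain, and the exact dork shown in the demo may be worded differently. The filetype and inurl operators are standard Google search operators.

```python
def build_dork(phrases, filetype=None, inurl=None):
    """Compose a Google search string from quoted phrases and optional operators."""
    parts = [f'"{phrase}"' for phrase in phrases]
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    if inurl:
        parts.append(f"inurl:{inurl}")  # restrict results to matching URLs
    return " ".join(parts)


# A dork similar in spirit to the one in the demo, with no URL focus.
base = build_dork(["password", "Login Info"], filetype="txt")

# The same dork narrowed to a placeholder domain for an exposure check.
scoped = build_dork(["password", "Login Info"], filetype="txt", inurl="example.com")

print(base)
print(scoped)
```

Adding the URL focus turns a broad hunt into a check of one organization’s exposure, which is how a security team would normally use it.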
[Slide 17]
Maltego goes further than Google or Shodan. In addition to collecting data from internet resources, it identifies relationships useful for attackers. The community edition is free with Kali Linux and is a useful tool for security professionals trying to develop a deeper understanding of their organizations’ footprints.
For a Maltego Community Edition demo, I couldn’t do any better than Joshua James at DFIR.Science. Here is the link. (https://youtu.be/-4ell2N3kj4). Best viewed on your computer screen.
[Slide 18]
That’s it for now. For more videos, whitepapers, and daily threat intel posts, please follow my blog on IT.Toolbox.com or with my RSS feed.
Until next time, be careful what you click.