Have I Been Pwned: API v3

Overview

This page documents version 3 of the API, which is the current version. It contains breaking changes over previous versions for searching breaches and pastes via email address.

Index

Overview
Breaches
Pastes
Pwned Passwords
Further reading

Authorisation

Authorisation is required for all APIs that enable searching HIBP by email address, namely retrieving all breaches for an account and retrieving all pastes for an account. An HIBP subscription key is required to make an authorised call and can be obtained on the API key page. The key is then passed in a "hibp-api-key" header:

GET https://haveibeenpwned.com/api/v3/{service}/{parameter}
hibp-api-key: [your key]

Semantic HTTP response codes are used to indicate the result of the API call:

Code	Description
401	Unauthorised — the API key provided was not valid

Specifying the API version

Version 3 of the API is consumable only by specifying the API version in the URL. In version 2, multiple API versioning schemes were supported; however, the overwhelming majority of implementations chose versioning via the URL. Consequently, the alternative versioning schemes have been discontinued for the APIs that retrieve breach or paste data via email address and require authorisation. Non-auth'd APIs such as retrieving a list of all previous breaches still support multiple versioning schemes so as not to break existing dependencies.

Versioning via the URL

This method can easily be invoked directly by requesting the URL with an appropriate user agent string.

GET https://haveibeenpwned.com/api/v3/{service}/{parameter}

Specifying the user agent

Each request to the API must be accompanied by a user agent request header. Typically this should be the name of the app consuming the service. A missing user agent will result in an HTTP 403 response. A valid request would look like:

GET https://haveibeenpwned.com/api/v3/{service}/{parameter}
user-agent: [your app name]

The user agent should accurately describe the nature of the API consumer such that it can be clearly identified in the request. Not doing so may result in the request being blocked.
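
The two headers described above can be combined on a single request. The following is a minimal sketch using only Python's standard library; the helper name `build_request`, the app name and the placeholder key are assumptions made for illustration and are not part of the API.

```python
import urllib.request

API_ROOT = "https://haveibeenpwned.com/api/v3"


def build_request(service: str, parameter: str, api_key: str, app_name: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying both required headers."""
    url = f"{API_ROOT}/{service}/{parameter}"
    headers = {
        "hibp-api-key": api_key,  # required for searches by email address
        "user-agent": app_name,   # a missing user agent results in HTTP 403
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical values; substitute your own key and app name.
request = build_request("breachedaccount", "test%40example.com", "your-key", "my-hibp-client")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a key.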

Getting all breaches for an account

The most common use of the API is to return a list of all breaches a particular account has been involved in. The API takes a single parameter which is the account to be searched for. The account is not case sensitive and will be trimmed of leading or trailing white spaces. The account should always be URL encoded. This is an authenticated API and an HIBP API key must be passed with the request.

GET https://haveibeenpwned.com/api/v3/breachedaccount/{account}
hibp-api-key: [your key]

By default, only the name of the breach is returned rather than the complete breach data, thus reducing the response body size by approximately 98%. The name can then be used to either retrieve a single breach or it can be found in the list of all breaches in the system. If you'd like complete breach data returned in the API call, a non-truncated response can be specified via query string parameter:

Parameter	Example	Description
truncateResponse	?truncateResponse=false	Returns the full breach model.

Note: In version 2 of the API this behaviour was the opposite - responses were not truncated by default.

The result set can also be filtered by passing one of the following query strings:

Parameter	Example	Description
domain	?domain=adobe.com	Filters the result set to only breaches against the domain specified. It is possible that one site (and consequently domain), is compromised on multiple occasions.

Note: the public API will not return accounts from any breaches flagged as sensitive or retired. By default, the API will return breaches flagged as unverified; however, these can be excluded by using the following parameter:

Parameter	Example	Description
includeUnverified	?includeUnverified=false	Returns breaches that have been flagged as "unverified". By default, both verified and unverified breaches are returned when performing a search.

Note: In version 2 of the API this behaviour was the opposite - unverified breaches were not returned by default.
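
As a sketch of how the account and the query string parameters above fit together, the following builds the request URL with the account URL encoded. The function name and its keyword arguments are hypothetical; only the parameter names and values come from the tables above.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://haveibeenpwned.com/api/v3"


def breached_account_url(account, truncate_response=True, domain=None, include_unverified=True):
    """Return the breachedaccount URL for an account, with optional filters."""
    # Encode every reserved character, including "@" and "+".
    path = quote(account.strip(), safe="")
    params = {}
    if not truncate_response:
        params["truncateResponse"] = "false"
    if domain is not None:
        params["domain"] = domain
    if not include_unverified:
        params["includeUnverified"] = "false"
    query = f"?{urlencode(params)}" if params else ""
    return f"{API_ROOT}/breachedaccount/{path}{query}"


print(breached_account_url("test@example.com"))
print(breached_account_url("test@example.com", truncate_response=False, domain="adobe.com"))
```

Parameters are only added when they differ from the documented defaults, which keeps the default request identical to the plain URL shown earlier.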

Getting all breached sites in the system

A "breach" is an instance of a system having been compromised by an attacker and the data disclosed. For example, Adobe was a breach, Gawker was a breach etc. It is possible to return the details of each breach in the system, which currently stands at 649 breaches.

By version in URL:

GET https://haveibeenpwned.com/api/v3/breaches

The result set can also be filtered by passing one of the following query strings:

Parameter	Example	Description
domain	?domain=adobe.com	Filters the result set to only breaches against the domain specified. It is possible that one site (and consequently domain), is compromised on multiple occasions.

Getting a single breached site

Sometimes just a single breach is required and this can be retrieved by the breach "name". This is the stable value which may or may not be the same as the breach "title" (which can change). See the breach model below for more info.

By version in URL:

GET https://haveibeenpwned.com/api/v3/breach/{name}

Getting all data classes in the system

A "data class" is an attribute of a record compromised in a breach. For example, many breaches expose data classes such as "Email addresses" and "Passwords". The values returned by this service are ordered alphabetically in a string array and will expand over time as new breaches expose previously unseen classes of data.

By version in URL:

GET https://haveibeenpwned.com/api/v3/dataclasses

The breach model

Each breach contains a number of attributes describing the incident. In the future, these attributes may expand without the API being versioned. The current attributes are:

Attribute	Type	Description
Name	string	A Pascal-cased name representing the breach which is unique across all other breaches. This value never changes and may be used to name dependent assets (such as images) but should not be shown directly to end users (see the "Title" attribute instead).
Title	string	A descriptive title for the breach suitable for displaying to end users. It's unique across all breaches but individual values may change in the future (i.e. if another breach occurs against an organisation already in the system). If a stable value is required to reference the breach, refer to the "Name" attribute instead.
Domain	string	The domain of the primary website the breach occurred on. This may be used for identifying other assets external systems may have for the site.
BreachDate	date	The date (with no time) the breach originally occurred on in ISO 8601 format. This is not always accurate — frequently breaches are discovered and reported long after the original incident. Use this attribute as a guide only.
AddedDate	datetime	The date and time (precision to the minute) the breach was added to the system in ISO 8601 format.
ModifiedDate	datetime	The date and time (precision to the minute) the breach was modified in ISO 8601 format. This will only differ from the AddedDate attribute if other attributes represented here are changed or data in the breach itself is changed (i.e. additional data is identified and loaded). It is always either equal to or greater than the AddedDate attribute, never less than.
PwnCount	integer	The total number of accounts loaded into the system. This is usually less than the total number reported by the media due to duplication or other data integrity issues in the source data.
Description	string	Contains an overview of the breach represented in HTML markup. The description may include markup such as emphasis and strong tags as well as hyperlinks.
DataClasses	string[]	This attribute describes the nature of the data compromised in the breach and contains an alphabetically ordered string array of impacted data classes.
IsVerified	boolean	Indicates whether the breach is considered verified; a value of false flags it as unverified. An unverified breach may not have been hacked from the indicated website. An unverified breach is still loaded into HIBP when there's sufficient confidence that a significant portion of the data is legitimate.
IsFabricated	boolean	Indicates that the breach is considered fabricated. A fabricated breach is unlikely to have been hacked from the indicated website and usually contains a large amount of manufactured data. However, it still contains legitimate email addresses and asserts that the account owners were compromised in the alleged breach.
IsSensitive	boolean	Indicates if the breach is considered sensitive. The public API will not return any accounts for a breach flagged as sensitive.
IsRetired	boolean	Indicates if the breach has been retired. This data has been permanently removed and will not be returned by the API.
IsSpamList	boolean	Indicates if the breach is considered a spam list. This flag has no impact on any other attributes but it means that the data has not come as a result of a security compromise.
IsMalware	boolean	Indicates if the breach is sourced from malware. This flag has no impact on any other attributes, it merely flags that the data was sourced from a malware campaign rather than a security compromise of an online service.
LogoPath	string	A URI that specifies where a logo for the breached service can be found. Logos are always in PNG format.

Sample breach response

All responses return breach models either in a collection (breaches for account or all breaches in the system) or as a single item (retrieving a breach by name). When a collection is returned, it's sorted alphabetically by the title of the breach.

[
{
"Name":"Adobe",
"Title":"Adobe",
"Domain":"adobe.com",
"BreachDate":"2013-10-04",
"AddedDate":"2013-12-04T00:00Z",
"ModifiedDate":"2022-05-15T23:52Z",
"PwnCount":152445165,
"Description":"In October 2013, 153 million Adobe accounts were breached with each containing an internal ID, username, email, <em>encrypted</em> password and a password hint in plain text. The password cryptography was poorly done and many were quickly resolved back to plain text. The unencrypted hints also <a href=\"http://www.troyhunt.com/2013/11/adobe-credentials-and-serious.html\" target=\"_blank\" rel=\"noopener\">disclosed much about the passwords</a> adding further to the risk that hundreds of millions of Adobe customers already faced.",
"DataClasses":["Email addresses","Password hints","Passwords","Usernames"],
"IsVerified":true,
"IsFabricated":false,
"IsSensitive":false,
"IsRetired":false,
"IsSpamList":false,
"LogoPath":"https://haveibeenpwned.com/Content/Images/PwnedLogos/Adobe.png"
},
{
"Name":"BattlefieldHeroes",
"Title":"Battlefield Heroes",
"Domain":"battlefieldheroes.com",
"BreachDate":"2011-06-26",
"AddedDate":"2014-01-23T13:10Z",
"ModifiedDate":"2014-01-23T13:10Z",
"PwnCount":530270,
"Description":"In June 2011 as part of a final breached data dump, the hacker collective &quot;LulzSec&quot; <a href=\"http://www.rockpapershotgun.com/2011/06/26/lulzsec-over-release-battlefield-heroes-data\" target=\"_blank\" rel=\"noopener\">obtained and released over half a million usernames and passwords from the game Battlefield Heroes</a>. The passwords were stored as MD5 hashes with no salt and many were easily converted back to their plain text versions.",
"DataClasses":["Passwords","Usernames"],
"IsVerified":true,
"IsFabricated":false,
"IsSensitive":false,
"IsRetired":false,
"IsSpamList":false,
"LogoPath":"https://haveibeenpwned.com/Content/Images/PwnedLogos/BattlefieldHeroes.png"
}
]
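
A response like the one above can be mapped onto a typed structure. The following is a minimal sketch, not a complete client: the `Breach` class, its field names and the selection of attributes are assumptions, and only some attributes of the model are read.

```python
import json
from dataclasses import dataclass
from datetime import date, datetime, timezone


@dataclass
class Breach:
    name: str
    title: str
    breach_date: date
    added_date: datetime
    pwn_count: int
    data_classes: list
    is_verified: bool


def parse_utc_minute(value: str) -> datetime:
    """Parse timestamps shaped like "2013-12-04T00:00Z" (minute precision, UTC)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%MZ").replace(tzinfo=timezone.utc)


def parse_breach(item: dict) -> Breach:
    # Only named attributes are read, so attributes added in future are ignored.
    return Breach(
        name=item["Name"],
        title=item["Title"],
        breach_date=date.fromisoformat(item["BreachDate"]),
        added_date=parse_utc_minute(item["AddedDate"]),
        pwn_count=item["PwnCount"],
        data_classes=item["DataClasses"],
        is_verified=item["IsVerified"],
    )


# Abbreviated copy of the first item in the sample response.
body = (
    '[{"Name":"Adobe","Title":"Adobe","BreachDate":"2013-10-04",'
    '"AddedDate":"2013-12-04T00:00Z","PwnCount":152445165,'
    '"DataClasses":["Email addresses","Password hints","Passwords","Usernames"],'
    '"IsVerified":true}]'
)
breaches = [parse_breach(item) for item in json.loads(body)]
print(breaches[0].name, breaches[0].breach_date, breaches[0].pwn_count)
```

Reading attributes by name rather than by position means the parser keeps working if the model expands without the API being versioned.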

Getting all pastes for an account

The API takes a single parameter which is the email address to be searched for. The email is not case sensitive and will be trimmed of leading or trailing white spaces. The email should always be URL encoded. This is an authenticated API and an HIBP API key must be passed with the request.

GET https://haveibeenpwned.com/api/v3/pasteaccount/{account}
hibp-api-key: [your key]

The paste model

Each paste contains a number of attributes describing it. In the future, these attributes may expand without the API being versioned. The current attributes are:

Attribute	Type	Description
Source	string	The paste service the record was retrieved from. Current values are: Pastebin, Pastie, Slexy, Ghostbin, QuickLeak, JustPaste, AdHocUrl, PermanentOptOut, OptOut
Id	string	The ID of the paste as it was given at the source service. Combined with the "Source" attribute, this can be used to resolve the URL of the paste.
Title	string	The title of the paste as observed on the source site. This may be null and if so will be omitted from the response.
Date	date	The date and time (precision to the second) that the paste was posted. This is taken directly from the paste site when this information is available but may be null if no date is published.
EmailCount	integer	The number of emails that were found when processing the paste. Emails are extracted by using the regular expression \b[a-zA-Z0-9\.\-_\+]+@[a-zA-Z0-9\.\-_]+\.[a-zA-Z]+\b
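
To illustrate what the regular expression in the EmailCount description matches, the sketch below applies it with Python's `re` module. The sample text is made up, the service's regular expression engine may behave differently from Python's, and whether the service counts every match or only distinct addresses is not stated here; this sketch counts every match.

```python
import re

# The pattern quoted in the EmailCount description, as a Python raw string.
EMAIL_PATTERN = re.compile(r"\b[a-zA-Z0-9\.\-_\+]+@[a-zA-Z0-9\.\-_]+\.[a-zA-Z]+\b")


def count_emails(text: str) -> int:
    """Count the non-overlapping matches of the pattern in a block of text."""
    return len(EMAIL_PATTERN.findall(text))


# Made-up paste content for illustration.
sample = "alice@example.com, bob.smith+tag@mail.example.org and not-an-email"
print(count_emails(sample))
```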

Sample paste response

Searching an account for pastes always returns a collection of the paste entity. The collection is sorted chronologically with the newest paste first.

[
{
"Source":"Pastebin",
"Id":"8Q0BvKD8",
"Title":"syslog",
"Date":"2014-03-04T19:14:54Z",
"EmailCount":139
},
{
"Source":"Pastie",
"Id":"7152479",
"Date":"2013-03-28T16:51:10Z",
"EmailCount":30
}
]
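
Because "Title" may be omitted and "Date" may be null, a consumer needs to treat both as optional. The following is a sketch of one way to do that using the sample response above; the function name and the shape of the returned dictionary are assumptions.

```python
import json
from datetime import datetime, timezone


def parse_paste(item: dict) -> dict:
    """Normalise one paste; "Title" may be omitted and "Date" may be null."""
    raw_date = item.get("Date")
    posted = None
    if raw_date is not None:
        posted = datetime.strptime(raw_date, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "source": item["Source"],
        "id": item["Id"],
        "title": item.get("Title"),  # None when the attribute is absent
        "date": posted,
        "email_count": item["EmailCount"],
    }


# The sample response shown above.
body = """[
  {"Source":"Pastebin","Id":"8Q0BvKD8","Title":"syslog","Date":"2014-03-04T19:14:54Z","EmailCount":139},
  {"Source":"Pastie","Id":"7152479","Date":"2013-03-28T16:51:10Z","EmailCount":30}
]"""
pastes = [parse_paste(item) for item in json.loads(body)]
print([paste["title"] for paste in pastes])
```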

Pwned Passwords overview

Pwned Passwords are more than half a billion passwords which have previously been exposed in data breaches. The service is detailed in the launch blog post then further expanded on with the release of version 2. The entire data set is both downloadable and searchable online via the Pwned Passwords page.

Each password is stored as a SHA-1 hash of a UTF-8 encoded password. The downloadable source data delimits the full SHA-1 hash and the password count with a colon (:) and each line with a CRLF.
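
The storage format can be reproduced locally. This sketch hashes a password the way the paragraph above describes and splits one line of the downloadable data; the helper names are hypothetical, the count is made up, and upper-case hex output is an assumption rather than something stated above.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Return the upper-case hex SHA-1 digest of the UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def parse_download_line(line: str) -> tuple:
    """Split one "HASH:count" line from the downloadable data into its parts."""
    full_hash, count = line.rstrip("\r\n").split(":")
    return full_hash, int(count)


digest = sha1_hex("password")
print(digest, len(digest))
# A made-up count, used only to show the line format.
print(parse_download_line(digest + ":42\r\n"))
```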

Searching by range

In order to protect the value of the source password being searched for, Pwned Passwords also implements a k-Anonymity model that allows a password to be searched for by partial hash. This allows the first 5 characters of a SHA-1 password hash (not case-sensitive) to be passed to the API:

GET https://api.pwnedpasswords.com/range/{first 5 hash chars}

When a password hash with the same first 5 characters is found in the Pwned Passwords repository, the API will respond with an HTTP 200 and include the suffix of every hash beginning with the specified prefix, followed by a count of how many times it appears in the data set. The API consumer can then search the results of the response for the presence of their source hash and if not found, the password does not exist in the data set. A sample response for the hash prefix "21BD1" would be as follows:

0018A45C4D1DEF81644B54AB7F969B88D65:1
00D4F6E8FA6EECAD2A3AA415EEC418D38EC:2
011053FD0102E94D6AE2F8B83D76FAF94F6:1
012A7CA357541F0AC487871FEEC1891C49C:2
0136E006E24E7D152139815FB0FC6A50B15:2
...

A range search typically returns approximately 500 hash suffixes, although this number will differ depending on the hash prefix being searched for and will increase as more passwords are added. There are 1,048,576 different hash prefixes between 00000 and FFFFF (16^5) and every single one will return HTTP 200; there is no circumstance in which the API should return HTTP 404.

Code	Body	Description
200	Hash suffixes counts	Ok — all password hashes beginning with the searched prefix are returned alongside prevalence counts
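
The client-side half of the range search can be sketched as follows: hash the password locally, send only the first 5 characters, then look for the remaining 35 characters in the response. The function names are hypothetical and the network call itself is left out so the example runs offline against the sample lines above.

```python
import hashlib


def split_hash(password: str) -> tuple:
    """Return (prefix, suffix): the first 5 and remaining 35 hex characters."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(response_text: str, suffix: str) -> int:
    """Return the count listed for a suffix, or 0 if it is not in the response."""
    for line in response_text.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix.upper():
            return int(count)
    return 0


prefix, suffix = split_hash("example password")
url = f"https://api.pwnedpasswords.com/range/{prefix}"  # only the prefix leaves the client

# The first lines of the sample response for prefix "21BD1".
sample = (
    "0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n"
    "00D4F6E8FA6EECAD2A3AA415EEC418D38EC:2\r\n"
)
print(count_in_range(sample, "00D4F6E8FA6EECAD2A3AA415EEC418D38EC"))
```

A result of 0 means the suffix was not in the response, which is how a password absent from the data set shows up to the consumer.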

Introducing padding

In order to further enhance privacy, padding can be added to responses such that anyone able to intercept encrypted responses to the API cannot reasonably determine which hash prefix was searched for by observing the response size. Padding is enabled by a request header and ensures that all responses contain between 800 and 1,000 results regardless of the number of hash suffixes returned by the service. Read the full blog post on padding.

Header	Example	Description
Add-Padding	Add-Padding: true	Pads out responses to ensure all results contain a random number of records between 800 and 1,000.

Note: Padded entries always have a password count of 0 and can be discarded once received.
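
A sketch of requesting padding and discarding the padded entries afterwards is shown below. The helper names are assumptions, and the padded line in the sample is made up to illustrate an entry with a count of 0.

```python
import urllib.request


def padded_range_request(prefix: str) -> urllib.request.Request:
    """Build a range request that asks the service to pad its response."""
    url = f"https://api.pwnedpasswords.com/range/{prefix}"
    return urllib.request.Request(url, headers={"Add-Padding": "true"})


def strip_padding(response_text: str) -> dict:
    """Map suffix -> count, discarding padded entries (count of 0)."""
    results = {}
    for line in response_text.splitlines():
        suffix, _, count = line.partition(":")
        if count.strip().isdigit() and int(count) > 0:
            results[suffix] = int(count)
    return results


# Two sample lines plus one made-up padded entry.
sample = (
    "0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n"
    "00D4F6E8FA6EECAD2A3AA415EEC418D38EC:2\r\n"
    + "F" * 35 + ":0\r\n"
)
print(strip_padding(sample))
```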

HTTPS

All API endpoints must be invoked over HTTPS. Any requests over HTTP will result in a 301 response with a redirect to the same path on the secure scheme. Only TLS versions 1.2 and 1.3 are supported; older versions of the protocol will not allow a connection to be made.
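
Most current HTTP clients negotiate TLS 1.2 or later without any configuration. Where a client needs to be pinned explicitly, a minimal sketch with Python's standard `ssl` module looks like this; it is one possible client-side setup, not a requirement of the API.

```python
import ssl

# Start from the platform defaults (certificate and hostname checks enabled).
context = ssl.create_default_context()
# Refuse anything older than TLS 1.2, matching what the service accepts.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.minimum_version.name)
```

The context can then be supplied to `urllib.request.urlopen` through its `context` argument.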

Response codes

Semantic HTTP response codes are used to indicate the result of the search:

Code	Description
200	Ok — everything worked and there's a string array of pwned sites for the account
400	Bad request — the account does not comply with an acceptable format (i.e. it's an empty string)
401	Unauthorised — either no API key was provided or it wasn't valid
403	Forbidden — no user agent has been specified in the request
404	Not found — the account could not be found and has therefore not been pwned
429	Too many requests — the rate limit has been exceeded
503	Service unavailable — usually returned by Cloudflare if the underlying service is not available
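
One way to act on these codes when searching an account is sketched below. Treating 404 as an empty result and raising on every other non-200 code is a design choice of this example, and the function name is hypothetical.

```python
import json


def interpret_response(status: int, body: str) -> list:
    """Turn a breachedaccount status code and body into a list of breaches."""
    if status == 200:
        return json.loads(body)
    if status == 404:
        return []  # the account was not found, so it has not been pwned
    reasons = {
        400: "the account does not comply with an acceptable format",
        401: "no API key was provided or it wasn't valid",
        403: "no user agent has been specified in the request",
        429: "the rate limit has been exceeded",
        503: "the underlying service is not available",
    }
    raise RuntimeError(f"HTTP {status}: {reasons.get(status, 'unexpected status')}")


print(interpret_response(200, '[{"Name":"Adobe"}]'))
print(interpret_response(404, ""))
```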

Test accounts

Test accounts exist to demonstrate different behaviours. All accounts are on the domain "hibp-integration-tests.com", for example "account-exists@hibp-integration-tests.com".

Alias	Description
account-exists	Returns one breach and one paste.
multiple-breaches	Returns three breaches.
not-active-and-active-breach	Returns one breach being "Adobe". An inactive breach also exists against this account in the underlying data structure.
not-active-breach	Returns no breaches. An inactive data breach also exists against this account in the underlying data structure.
opt-out	Returns no breaches and no pastes. This account is opted-out of both pastes and breaches in the underlying data structure.
opt-out-breach	Returns no breaches and no pastes. This account is opted-out of breaches in the underlying data structure.
paste-sensitive-breach	Returns no breaches and one paste. A sensitive breach exists against this account in the underlying data structure.
permanent-opt-out	Returns no breaches and no pastes. This account is permanently opted-out of both breaches and pastes in the underlying data structure.
sensitive-and-other-breaches	Returns two non-sensitive breaches and no pastes. A sensitive breach exists against this account in the underlying data structure.
sensitive-breach	Returns no breaches and no pastes. A sensitive breach exists against this account in the underlying data structure.
unverified-breach	Returns one unverified breach and no pastes.

Cross-origin resource sharing (CORS)

CORS is only supported for non-authenticated APIs. When supported, it accepts all origins — you can hit the API from websites on any other domain.

Rate limiting

Requests to the breaches and pastes APIs are rate limited. The rate limit depends on the API key you've purchased. Any request that exceeds the limit will receive an HTTP 429 "Too many requests" response. The response also includes an accompanying "retry-after" response header expressing the number of seconds remaining before the client can make a successful API call with the same key (the value is rounded up to the next whole second). The response body explains the rate limit and refers to the acceptable use documentation.

A typical response looks like this:

HTTP/1.1 429
retry-after: 2
{ "statusCode": 429, "message": "Rate limit is exceeded. Try again in 2 seconds." }

The retry period is sliding; attempting to query the API more aggressively than the rate allows causes the retry period to start again with each failed request. It's advisable to avoid querying the API at exactly the rate limit as network behaviour may result in some requests arriving within the retry period and causing a 429. Adding an additional 100 millisecond delay between requests on top of the rate limit will usually ensure this won't happen.
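
The pacing advice above can be sketched as two small helpers: one that spaces requests at the rate limit plus the extra 100 milliseconds, and one that reads the "retry-after" header from a 429 response. Expressing the limit in requests per minute, the example limit of 10 and the fallback wait of 1 second are all assumptions of this sketch.

```python
def delay_between_requests(requests_per_minute: int, margin_seconds: float = 0.1) -> float:
    """Seconds to wait between calls: the rate-limit interval plus a safety margin."""
    return 60.0 / requests_per_minute + margin_seconds


def seconds_to_wait(headers: dict, default: float = 1.0) -> float:
    """Read the "retry-after" header of a 429 response (header names vary in case)."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            return float(value)
    return default


# Assumes a hypothetical key allowing 10 requests per minute.
print(delay_between_requests(10))
print(seconds_to_wait({"retry-after": "2"}))
```

A client would sleep for `delay_between_requests(...)` between calls and, on a 429, sleep for `seconds_to_wait(...)` before trying again rather than retrying immediately, since each early retry restarts the sliding period.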

Where the rate limit is consistently exceeded, further defences may be employed to limit the ability to query the API. These defences include blocks or JavaScript challenges by Cloudflare which may result in an HTTP 503 "Service Unavailable" response.

There is no rate limit on the Pwned Passwords API.

Abuse

There's not much point; if you want to build up a treasure trove of pwned email addresses or usernames, go and download the dumps (they're usually just a Google search away) and save yourself the hassle and time of trying to enumerate an API one account at a time. That said, use of the API should fall within acceptable use expectations:

Acceptable use

The API has been designed to make it easy for people to do awesome things with it. Things that are not awesome include:

Querying the data for purposes that are intended to cause harm to the victims of data breaches
Anything deliberately intended to limit service availability such as denial of service attacks
Deliberate attempts to circumvent measures designed to ensure acceptable use
Not properly identifying the user agent such that it accurately describes the consumer of the API
Misrepresenting the consuming client by impersonating other user agents in an attempt to obfuscate API requests
Other services designed to fraudulently represent the Have I Been Pwned name or brand
Misrepresenting the source of the data as originating from somewhere other than Have I Been Pwned
Not adhering to the Creative Commons Attribution License as described below
Automating the consumption of other APIs not explicitly documented on this page
Using the service in a fashion that brings Have I Been Pwned into disrepute

Abusing these objectives may limit your ability to query the service via a range of countermeasures. Those countermeasures may impact other consumers of the API if they share network services with an abusive user. If in doubt, get in touch and outline how you'd like to use the service in a way that's consistent with these objectives.

License — breach & paste APIs

This work is licensed under a Creative Commons Attribution 4.0 International License.

In other words, you're welcome to use the public API to build other services, but you must identify Have I Been Pwned as the source of the data. Clear and visible attribution with a link to haveibeenpwned.com should be present anywhere data from the service is used including when searching breaches or pastes and when representing breach descriptions. It doesn't have to be overt, but the interface in which Have I Been Pwned data is represented should clearly attribute the source per the Creative Commons Attribution 4.0 International License.

In order to help maximise adoption, there are no licensing or attribution requirements on the Pwned Passwords API, although attribution is welcomed if you would like to include it.

API v3

The API allows the list of pwned accounts (email addresses and usernames) to be quickly searched via a RESTful service.
