Supported Entity Types
A list of supported entity types.
Deepgram can detect, format, and redact over 50 unique entity types. The complete inventory of supported entity types is listed in the charts below, divided into four groups: PII (Personally Identifiable Information), PHI (Protected Health Information), PCI (Payment Card Industry), and Other Entities.
Note that some entities, such as name and location, also have subtypes. For instance, location_city is a subtype of location. This means that, in a phrase such as "I live in Boston", the location name Boston will be detected as both location and location_city, with the more specific label (in this case, location_city) appearing in the output. Other entity types are groupings of related categories. For example, healthcare_number captures health plan beneficiary numbers and medical record numbers, both of which are outlined as identifiers in the HIPAA Safe Harbor provision. Similarly, numerical_pii covers a broad range of entity types such as MAC addresses and cookie IDs.
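The subtype behavior described above can be pictured with a small sketch. It is not Deepgram's implementation: the function name most_specific_labels is hypothetical, and the rule that a subtype is named as its parent plus an underscore suffix is an assumption inferred only from the location / location_city example.

```python
def most_specific_labels(detected):
    """Keep only labels that are not the parent of another detected label.

    Assumption (not confirmed by the documentation): a subtype is named
    "<parent>_<suffix>", as in location -> location_city.
    """
    def is_subtype(child, parent):
        return child.startswith(parent + "_")

    return [
        label
        for label in detected
        if not any(is_subtype(other, label) for other in detected)
    ]


# "Boston" matches both the general type and its subtype;
# only the more specific label is kept.
print(most_specific_labels(["location", "location_city"]))
```

A label such as healthcare_number is left untouched by this rule, because it is only dropped when one of its own subtypes was detected alongside it.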
While entity types have English names, international variants are also redacted. For example, ssn covers American Social Security Numbers, as well as many equivalent identification numbers used in different regions worldwide, such as the Canadian Social Insurance Number or the German Sozialversicherungsnummer.
Redacting Certain Entities
When using Deepgram’s hosted pre-recorded product, our redaction functionality can redact over 50 unique entity types.
Individual entity classes can be redacted with redact=entity_class.
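As a minimal sketch of how the redact=entity_class parameter might be assembled for a request, the helper below builds a query string with one redact parameter per entity class. The function name build_redaction_url is hypothetical, and both the endpoint URL and the practice of repeating the redact parameter for several classes are assumptions to verify against the API reference. The sketch only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed endpoint for the hosted pre-recorded product; verify before use.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_redaction_url(entity_classes, base_url=BASE_URL):
    """Return a URL carrying one redact=<entity_class> parameter per class."""
    if not entity_classes:
        raise ValueError("at least one entity class is required")
    # A list of pairs lets the same key ("redact") appear more than once.
    params = [("redact", name) for name in entity_classes]
    return f"{base_url}?{urlencode(params)}"


print(build_redaction_url(["ssn", "location_city"]))
```

The entity class names used here (ssn, location_city) come from the examples earlier on this page; any other value should be taken from the supported entity type lists.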
This functionality is available only for Deepgram’s hosted pre-recorded transcription product. When using Deepgram’s self-hosted offering or live streaming product, only the basic Redaction functionality is available.