How to Find Personal Information in Text

Last Updated: January 30, 2026

What Does the Analyzer Do?#

The Analyzer is like a detective. It reads through text and finds personal information that should be kept private. This could be things like:

  • Someone's name
  • An email address
  • A phone number
  • A credit card number
  • A home address

This is useful because sometimes you need to share documents but want to hide the private details first.


How to Use the Analyzer#

Step 1: Open the Analyzer#

  1. Go to anonym.legal and sign in
  2. Click "Analyzer" in the menu
  3. You will see a page with:
    • A big text box (where you paste your text)
    • Checkboxes to pick what to look for
    • A language picker
    • A slider for how careful the tool should be
    • An "Analyze" button

What Are Entity Types?#

Entity types are the different kinds of personal information the tool can find. Think of them like categories.

People Information#

  • PERSON: Names like "John Smith" or "Maria Garcia"
  • EMAIL_ADDRESS: Email addresses like "john@email.com"
  • PHONE_NUMBER: Phone numbers in any format
  • DATE_TIME: Dates like "January 15, 2026" or times
  • AGE: When someone's age is mentioned
  • GENDER: Words that tell someone's gender

Money Information#

  • CREDIT_CARD: Credit card numbers (with or without dashes)
  • IBAN: International bank account numbers (used in Europe and many other countries)
  • US_SSN: Social Security numbers (used in the USA)
  • TAX_ID: Tax identification numbers

Location Information#

  • LOCATION: Addresses, cities, and countries
  • IP_ADDRESS: Network addresses of computers and other devices (like 192.168.1.1)

Medical Information#

  • MEDICAL_LICENSE: Doctor's license numbers
  • MEDICAL_RECORD_NUMBER: Hospital patient numbers

Organization Information#

  • ORGANIZATION: Company names like "Apple" or "Nike"

The tool can find 24+ different types of personal information!

You can also create your own custom types if you need to find something special.
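
To make the idea of a custom type more concrete, here is a small Python sketch. It is only an illustration, not how the Analyzer is built: the `CustomEntityType` class, the `EMPLOYEE_ID` type, and the "EMP-" pattern are all made-up names for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomEntityType:
    """A hypothetical custom type: a label plus a pattern that recognizes it."""
    name: str
    pattern: re.Pattern


# Made-up example: employee IDs written as "EMP-" followed by five digits.
EMPLOYEE_ID = CustomEntityType("EMPLOYEE_ID", re.compile(r"\bEMP-\d{5}\b"))


def find_custom(text: str, entity: CustomEntityType) -> list[tuple[str, int, int, str]]:
    """Return (type, start, end, matched text) for every match in the text."""
    return [
        (entity.name, m.start(), m.end(), m.group())
        for m in entity.pattern.finditer(text)
    ]


matches = find_custom("Badge EMP-00412 was issued to the new hire.", EMPLOYEE_ID)
print(matches)
```

A custom type is useful when your documents contain identifiers that are private but follow a pattern only your organization uses.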


Language Support#

The Analyzer works with 48 different languages! Here are some examples:

  • English
  • Spanish
  • French
  • German
  • Chinese
  • Japanese
  • Arabic
  • Hindi
  • Russian
  • Portuguese
  • Italian
  • And many more!

Good to know: The first time you use a new language, it might take a few extra seconds. After that, it will be fast.


What is the Confidence Threshold?#

The confidence threshold is like asking "How sure do you want the tool to be?"

Imagine you are looking for your friend named "Jordan" in a crowd. You might see someone who looks a little like Jordan, or someone who looks exactly like Jordan.

  • Low threshold (0.3): "Find anyone who might be Jordan" - You will find more people, but some might not actually be Jordan
  • Medium threshold (0.5): "Find people who probably are Jordan" - A good balance between missing people and wrong guesses
  • High threshold (0.8): "Only show me if you are really sure it is Jordan" - Fewer results and fewer wrong guesses, but the tool might miss the real Jordan

What Number Should You Use?#

  • Start with 0.5 - This works well for most people
  • Use 0.3-0.5 if you would rather catch as much as possible (even if some results are wrong)
  • Use 0.7-0.9 if you only want results the tool is very sure about
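
The effect of the threshold can be shown with a short Python sketch. This is a simplified illustration, not the Analyzer's real code: the `Detection` class and the scores are made up, and it assumes a result is kept when its score is equal to or higher than the threshold.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    entity_type: str
    text: str
    score: float  # confidence between 0.0 and 1.0


def filter_by_threshold(detections, threshold):
    """Keep only detections whose score is at least the threshold (an assumption)."""
    return [d for d in detections if d.score >= threshold]


# Made-up scores, for illustration only.
candidates = [
    Detection("PERSON", "Jordan Lee", 0.85),
    Detection("PERSON", "Jordan", 0.55),
    Detection("LOCATION", "Jordan", 0.35),
]

for threshold in (0.3, 0.5, 0.8):
    kept = filter_by_threshold(candidates, threshold)
    print(threshold, [(d.entity_type, d.text) for d in kept])
```

With these sample scores, a threshold of 0.3 keeps all three candidates, 0.5 keeps two, and 0.8 keeps only one.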

What Are Presets?#

Presets are like recipe cards. Instead of picking every setting yourself, you can use a preset that already has the right settings for common tasks.

Ready-Made Presets#

  1. General PII Detection

    • Good for everyday use
    • Finds names, emails, phone numbers, addresses, credit cards
  2. GDPR Compliance

    • For European privacy rules
    • Works with multiple languages
  3. HIPAA Medical

    • For hospital and doctor information
    • Very careful and accurate
  4. Financial Services

    • For banks and other financial information
    • Finds credit cards, bank accounts, Social Security numbers
  5. Development & Testing

    • For software developers
    • Good for checking test data
  6. Multi-Language European

    • For documents in European languages
    • Finds privacy information in multiple languages

How to Use a Preset#

  1. Click the "Preset" dropdown menu
  2. Pick a preset from the list
  3. All the settings are filled in automatically
  4. You can still change any setting afterwards if you want
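
One way to picture a preset is as a saved bundle of settings that you can still adjust. The Python sketch below is only an illustration under assumptions: the `Settings` class, the language code, and the threshold values are made up, and the real presets may use different values. The entity types follow the preset descriptions above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:
    entity_types: tuple[str, ...]
    language: str
    threshold: float


# Hypothetical values; the real presets may be configured differently.
PRESETS = {
    "General PII Detection": Settings(
        entity_types=("PERSON", "EMAIL_ADDRESS", "PHONE_NUMBER", "LOCATION", "CREDIT_CARD"),
        language="en",
        threshold=0.5,
    ),
    "Financial Services": Settings(
        entity_types=("CREDIT_CARD", "IBAN", "US_SSN"),
        language="en",
        threshold=0.5,
    ),
}

# Pick a preset, then change one setting without touching the preset itself.
settings = PRESETS["Financial Services"]
settings = replace(settings, threshold=0.7)
print(settings)
```

The preset fills in everything at once, and a later change only affects your current settings, not the preset.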

How Tokens Work#

Using the Analyzer costs tokens. Think of tokens like arcade coins - you use them to play the game.

How Many Tokens Will I Use?#

It depends on:

  • How much text you have
  • How many entity types you picked
  • How much personal information is found

Typical Token Costs#

Amount of Text               Tokens Used
Short text (a paragraph)     1-3 tokens
Medium text (a page)         3-6 tokens
Long text (several pages)    6-10 tokens

The tool shows you an estimate before you click "Analyze" so you know what to expect.


Understanding Your Results#

After the tool analyzes your text, you will see:

  1. Highlighted text - The personal information is marked in different colors
  2. A results list showing:
    • What type of information was found
    • Where in the text it was found
    • How sure the tool is (the confidence score)
    • The actual text that was found

Example#

If your text says: "John Doe's email is john@example.com"

The results would look like this:

Type             What Was Found      How Sure
PERSON           John Doe            95% sure
EMAIL_ADDRESS    john@example.com    98% sure
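
Each result carries a type, a position in the text, a confidence score, and the matched text. The Python sketch below shows one way such a result could be represented. It is an illustration, not the Analyzer's real output format: the `Result` class and the simple email pattern are assumptions, and the 0.98 score is hard-coded to mirror the example above.

```python
import re
from dataclasses import dataclass


@dataclass
class Result:
    entity_type: str  # what type of information was found
    start: int        # where the match begins in the text
    end: int          # where the match ends (exclusive)
    score: float      # how sure the tool is, from 0.0 to 1.0
    text: str         # the actual text that was found


# A deliberately simple email pattern, good enough for this example.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_emails(text: str) -> list[Result]:
    """Return one Result per email address, with a fixed illustrative score."""
    return [
        Result("EMAIL_ADDRESS", m.start(), m.end(), 0.98, m.group())
        for m in EMAIL_RE.finditer(text)
    ]


source = "John Doe's email is john@example.com"
results = find_emails(source)
for r in results:
    print(r.entity_type, r.start, r.end, f"{r.score:.0%}", r.text)
```

The start and end positions let you cut the match out of the original text with `source[r.start:r.end]`, which is handy when you want to hide or replace it.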

Tips for Best Results#

  1. Start with a preset - It is the easiest way to begin
  2. Only pick what you need - Do not select every entity type. Pick only the ones you are looking for
  3. Test with a small sample first - Try a little bit of text before doing a lot
  4. Check your results - Always look at what the tool found to make sure it is correct
  5. Pick the right language - Make sure you select the language your text is written in

Fixing Problems#

Problem: Nothing Was Found#

Why this might happen:

  • The confidence threshold is too high (the tool is being too picky)
  • You picked the wrong language
  • You did not select the right entity types
  • Your text does not have the type of information you are looking for

What to try:

  • Lower the confidence threshold slider
  • Check that you picked the correct language
  • Make sure you selected the right entity types
  • Try different entity types

Problem: Too Many Wrong Results#

Why this might happen:

  • The confidence threshold is too low (the tool is guessing too much)
  • You selected too many entity types

What to try:

  • Move the confidence threshold slider higher
  • Only select the entity types you really need

Problem: The Tool is Slow#

Why this might happen:

  • This is the first time using a new language (it needs to load)
  • Your text is very long
  • Your internet connection is slow

What to try:

  • Wait a moment for the language to load (only slow the first time)
  • Try smaller pieces of text
  • Check your internet connection

Problem: Credit Cards Are Not Found#

Make sure:

  • The CREDIT_CARD entity type is selected
  • The credit card number in your text uses a valid format

If the number is still not found, try lowering the confidence threshold to 0.4.

Credit cards can look like:

  • 4532-1234-5678-9010 (with dashes)
  • 4532 1234 5678 9010 (with spaces)
  • 4532123456789010 (just numbers)
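
The three formats above are the same number written in different ways. The Python sketch below shows how they can be reduced to plain digits, and how the Luhn checksum (a standard check built into real payment card numbers) can tell a plausible number from a random one. This is only an illustration: whether the Analyzer applies a checksum is an assumption, and the function names are made up.

```python
import re


def normalize_card(raw: str) -> str:
    """Remove spaces and dashes so every format becomes plain digits."""
    return re.sub(r"[ -]", "", raw)


def luhn_valid(number: str) -> bool:
    """Return True if the digits pass the Luhn checksum."""
    if not number.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit and double every second one.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


formats = ["4532-1234-5678-9010", "4532 1234 5678 9010", "4532123456789010"]
print({normalize_card(f) for f in formats})  # all three give the same digits

print(luhn_valid("4111111111111111"))  # a widely used test card number
print(luhn_valid(normalize_card(formats[0])))
```

The sample number in the list above only illustrates formatting and does not pass the Luhn checksum. If a tool checks the checksum, a made-up number like this one may get a lower score or not be found at all, so test with a number that has a valid checksum.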

