# Declaring Components

> SCANOSS provides a settings file to customise the scanning process.

The [`scanoss.json`](https://github.com/scanoss/schema) file contains project information and
BOM (Bill of Materials) rules. It allows you to include, exclude, remove, or replace components
in the BOM before and after scanning.

For an interactive, browsable view of the schema with detailed field descriptions and examples,
visit the [SCANOSS Schema Documentation](https://scanoss.github.io/schema).

## Overview

The `scanoss.json` settings file controls the behaviour of the SCANOSS scanner for your project.
It allows you to:

* **Define scan scope** — specify which files are included in or excluded from scanning and fingerprinting.
* **Configure detection behaviour** — tune how components and snippets are identified in your codebase.
* **Set proxy and HTTP options** — configure network settings for SCANOSS API requests.
* **Manage BOM rules** — include, remove, or replace components in scan results before or after scanning.

## Managing the `scanoss.json` File

### Manual Review Workflow

The `scanoss.json` file should be reviewed and updated manually after each scan. A human should evaluate whether the identified component details are valid and appropriate for your project before making changes.

**Recommended workflow:**

1. Run a scan
2. Review the undeclared components
3. Evaluate whether each component is accurate and appropriate for declaration
4. Manually add validated entries to `scanoss.json`
5. Re-run the scan to confirm the declared components

### Automatic Generation

There is currently no built-in mechanism to automatically insert entries directly into `scanoss.json`. The file is intended to be managed manually, with human review of each declaration. If you need to automate data retrieval and insertion, you can do so via custom scripting.
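
If you do script the insertion, a minimal Python sketch might look like the following. It assumes the `bom.include` layout described later on this page. The helper names `add_include_entry` and `update_settings_file` are hypothetical and not part of any SCANOSS tool, and the decision about which components to declare still rests with a human reviewer.

```python
import json
from pathlib import Path


def add_include_entry(settings: dict, path: str, purl: str, comment: str = "") -> bool:
    """Append a reviewed component to bom.include; return False if already declared."""
    include = settings.setdefault("bom", {}).setdefault("include", [])
    if any(e.get("path") == path and e.get("purl") == purl for e in include):
        return False  # avoid duplicate declarations
    entry = {"path": path, "purl": purl}
    if comment:
        entry["comment"] = comment
    include.append(entry)
    return True


def update_settings_file(settings_file: Path, path: str, purl: str, comment: str = "") -> bool:
    """Load scanoss.json (or start from an empty object), add the entry, write it back."""
    settings = json.loads(settings_file.read_text()) if settings_file.exists() else {}
    added = add_include_entry(settings, path, purl, comment)
    if added:
        settings_file.write_text(json.dumps(settings, indent=2) + "\n")
    return added
```

For example, `update_settings_file(Path("scanoss.json"), "src/lib/component.js", "pkg:npm/vue@2.6.12", "Reviewed")` would add one entry and leave the rest of the file's content unchanged.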

## Schema Overview

You can explore the complete schema interactively at [scanoss.github.io/schema](https://scanoss.github.io/schema),
which provides detailed field descriptions, validation rules, and examples.

Download a sample settings file here: <a href="scanoss-settings-schema.json" download>scanoss-settings-schema.json</a>

The settings file consists of three main sections: `self`, `settings`, and `bom`.

### Project Information

The `self` section contains basic information about your project:

```json theme={null}
{
  "self": {
    "name": "my-project",
    "license": "MIT",
    "description": "Project description"
  }
}
```

### Settings

The `settings` object configures various aspects of the scanning process. It includes file
filtering, network configuration, and snippet matching tuning parameters.

**Settings Structure:**

```json theme={null}
{
  "settings": {
    "skip": {
      // File filtering configuration
    },
    "proxy": {
      // Root-level proxy configuration
    },
    "http_config": {
      // Root-level HTTP configuration
    },
    "file_snippet": {
      // Snippet-specific tuning parameters
    },
    "hpfm": {
      // High Precision Folder Matching settings
    }
  }
}
```

#### Skip Configuration

The `skip` object defines rules for excluding files from scanning or fingerprinting. This can
improve scan performance and avoid unnecessary processing.

> **Note:** Patterns use the same syntax as `.gitignore` files. For details, refer to the
> [gitignore pattern documentation](https://git-scm.com/docs/gitignore).

##### Properties

**`skip.patterns.scanning`** — A list of file patterns to exclude from scanning.

| **Property** | **Description**  |
| ------------ | ---------------- |
| **Type**     | Array of strings |
| **Required** | No               |

**Example:**

```json theme={null}
{
  "settings": {
    "skip": {
      "patterns": {
        "scanning": [
          "*.log",
          "!important.log",
          "temp/",
          "debug[0-9]*.txt",
          "src/client/specific-file.js",
          "src/nested/folder/"
        ]
      }
    }
  }
}
```

**`skip.patterns.fingerprinting`** — A list of patterns specifying which files should be
skipped during fingerprinting.

| **Property** | **Description**  |
| ------------ | ---------------- |
| **Type**     | Array of strings |
| **Required** | No               |

**Example:**

```json theme={null}
{
  "settings": {
    "skip": {
      "patterns": {
        "fingerprinting": [
          "*.log",
          "!important.log",
          "temp/",
          "debug[0-9]*.txt",
          "src/client/specific-file.js",
          "src/nested/folder/"
        ]
      }
    }
  }
}
```

**`skip.patterns.dependencies`** — A list of file patterns to exclude from dependency scanning.
Use this to prevent specific manifest files or directories from being parsed for declared
dependencies.

| **Property** | **Description**  |
| ------------ | ---------------- |
| **Type**     | Array of strings |
| **Required** | No               |

**Example:**

```json theme={null}
{
  "settings": {
    "skip": {
      "patterns": {
        "dependencies": ["vendor/**", "third_party/"]
      }
    }
  }
}
```

**`skip.sizes.scanning`** — Rules for skipping files based on their size during scanning.

| **Property** | **Description**  |
| ------------ | ---------------- |
| **Type**     | Array of objects |
| **Required** | No               |

**Properties of each rule object:**

* **`patterns`** *(array of strings)* — List of glob patterns to which the size rule applies.
* **`min`** *(integer)* — Minimum file size in bytes.
* **`max`** *(integer, required)* — Maximum file size in bytes.

**Example:**

```json theme={null}
{
  "settings": {
    "skip": {
      "sizes": {
        "scanning": [
          {
            "patterns": [
              "*.log",
              "!important.log",
              "temp/",
              "debug[0-9]*.txt",
              "src/client/specific-file.js",
              "src/nested/folder/"
            ],
            "min": 100,
            "max": 1000000
          }
        ]
      }
    }
  }
}
```

**`skip.sizes.fingerprinting`** — Rules for skipping files based on their size during
fingerprinting.

| **Property** | **Description**  |
| ------------ | ---------------- |
| **Type**     | Array of objects |
| **Required** | No               |

**Properties of each rule object:**

* **`patterns`** *(array of strings)* — List of glob patterns to which the size rule applies.
* **`min`** *(integer)* — Minimum file size in bytes.
* **`max`** *(integer, required)* — Maximum file size in bytes.

**Example:**

```json theme={null}
{
  "settings": {
    "skip": {
      "sizes": {
        "fingerprinting": [
          {
            "patterns": [
              "*.log",
              "!important.log",
              "temp/",
              "debug[0-9]*.txt",
              "src/client/specific-file.js",
              "src/nested/folder/"
            ],
            "min": 100,
            "max": 1000000
          }
        ]
      }
    }
  }
}
```

##### Pattern Format Rules

* Patterns are matched relative to the scan root directory.
* A trailing slash indicates a directory (e.g., `path/` matches only directories).
* A single asterisk `*` matches any sequence of characters except a slash.
* Two asterisks `**` match zero or more path segments (e.g., `path/**/file.js` matches
  `path/file.js`, `path/to/file.js`, and `path/to/nested/file.js`).
* Range notation such as `[0-9]` matches any single character within the specified range.
* A question mark `?` matches any single character except a slash.

**Example:**

```bash theme={null}
# Match all .txt files
*.txt

# Match all .log files except important.log
*.log
!important.log

# Match all files in the build directory
build/

# Match all .pdf files in docs directory and its subdirectories
docs/**/*.pdf

# Match files like test1.js, test2.js, etc.
test[0-9].js
```
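
To make the wildcard rules above concrete, here is a small Python sketch that translates a subset of the pattern syntax into a regular expression. It is an illustration only, not how the scanner is implemented: the function name `glob_to_regex` is hypothetical, and the sketch handles just `*`, `**`, `?`, and bracket ranges, matched against the full relative path. Negation (`!`), trailing-slash directory semantics, and matching a bare filename at any depth are deliberately left out.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a subset of gitignore-style wildcards into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")      # zero or more whole path segments
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")            # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")         # any run of characters except a slash
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")          # exactly one character except a slash
            i += 1
        elif pattern[i] == "[" and pattern.find("]", i + 2) != -1:
            end = pattern.find("]", i + 2)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^" + body[1:]   # gitignore negates a range with '!'
            out.append("[" + body + "]")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def matches(pattern: str, path: str) -> bool:
    """Return True when the whole relative path matches the pattern."""
    return glob_to_regex(pattern).match(path) is not None
```

With this sketch, `matches("path/**/file.js", "path/file.js")` and `matches("test[0-9].js", "test1.js")` both return `True`, while `matches("a?b", "a/b")` returns `False` because `?` never matches a slash.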

##### Complete Skip Example

A comprehensive example combining pattern-based and size-based skip rules:

```json theme={null}
{
  "settings": {
    "skip": {
      "patterns": {
        "scanning": [
          "# Node.js dependencies",
          "node_modules/",

          "# Build outputs",
          "dist/",
          "build/"
        ],
        "fingerprinting": [
          "# Logs except important ones",
          "*.log",
          "!important.log",

          "# Temporary files",
          "temp/",
          "*.tmp",

          "# Debug files with numbers",
          "debug[0-9]*.txt",

          "# All test files in any directory",
          "**/*test.js"
        ]
      },
      "sizes": {
        "scanning": [
          {
            "patterns": ["*.log", "!important.log"],
            "min": 512,
            "max": 5242880
          }
        ],
        "fingerprinting": [
          {
            "patterns": [
              "temp/",
              "*.tmp",
              "debug[0-9]*.txt",
              "src/client/specific-file.js",
              "src/nested/folder/"
            ],
            "min": 512,
            "max": 5242880
          }
        ]
      }
    }
  }
}
```

#### Proxy Configuration

Root-level proxy configuration applied to all SCANOSS API requests. This can be overridden
at the snippet level using `file_snippet.proxy`.

**`settings.proxy`**

| **Property** | **Description** |
| ------------ | --------------- |
| **Type**     | Object          |
| **Required** | No              |

**Properties:**

* **`host`** *(string, required)* — Proxy server URL, including protocol and port.

**Example:**

```json theme={null}
{
  "settings": {
    "proxy": {
      "host": "http://file-snippet-proxy:8080"
    }
  }
}
```

#### HTTP Configuration

Root-level HTTP configuration applied to all SCANOSS API requests. This can be overridden
at the snippet level using `file_snippet.http_config`.

**`settings.http_config`**

| **Property** | **Description** |
| ------------ | --------------- |
| **Type**     | Object          |
| **Required** | No              |

**Properties:**

* **`base_uri`** *(string)* — Base API endpoint URL.
* **`ignore_cert_errors`** *(boolean)* — When set to `true`, SSL certificate validation errors
  are ignored. **Use with caution — this should not be enabled in production environments.**

**Example:**

```json theme={null}
{
  "settings": {
    "http_config": {
      "base_uri": "https://root-api.scanoss.com",
      "ignore_cert_errors": false
    }
  }
}
```

#### Snippet Matching Tuning

The `file_snippet` section provides fine-grained control over snippet matching behaviour.
These settings allow you to reduce false positives and improve match accuracy for your codebase.

##### `file_snippet.min_snippet_hits`

Minimum number of snippet hits required for a match to be considered valid. Higher values
reduce false positives by requiring more evidence before a match is reported.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "min_snippet_hits": 5
    }
  }
}
```

##### `file_snippet.min_snippet_lines`

Minimum number of lines a snippet must span to be considered a valid match. Filters out
short matches that are unlikely to be meaningful, such as single-line imports or common
boilerplate.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "min_snippet_lines": 3
    }
  }
}
```

##### `file_snippet.ranking_enabled`

Controls whether the quality score of the originating project is taken into account during matching.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "ranking_enabled": true
    }
  }
}
```

##### `file_snippet.ranking_threshold`

Sets the minimum ranking score (`0–10`) required for a match to be reported. Higher values
return only higher-confidence matches. Set to `-1` to use the server's default threshold.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "ranking_threshold": 7
    }
  }
}
```

##### `file_snippet.honour_file_exts`

Controls whether file extensions are taken into account during matching. Set to `false` when
files have been renamed or use non-standard extensions.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "honour_file_exts": false
    }
  }
}
```

##### `file_snippet.skip_headers`

Skips licence headers, comments, and imports at the beginning of files. Helps avoid false
matches on standard boilerplate that appears across many files.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "skip_headers": true
    }
  }
}
```

##### `file_snippet.skip_headers_limit`

Maximum number of lines to skip when `skip_headers` is enabled. Controls how much of the
beginning of each file is excluded from matching.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "skip_headers": true,
      "skip_headers_limit": 50
    }
  }
}
```

##### `file_snippet.dependency_analysis`

Enables dependency analysis during scanning. When set to `true`, the scanner analyses and
reports on dependencies declared in manifest files.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "dependency_analysis": true
    }
  }
}
```

##### `file_snippet.proxy`

Snippet-specific proxy configuration that overrides the root-level `settings.proxy`. Use
this when snippet scanning requires a different proxy than other API operations.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "proxy": {
        "host": "http://snippet-proxy.example.com:8080"
      }
    }
  }
}
```

##### `file_snippet.http_config`

Snippet-specific HTTP configuration that overrides the root-level `settings.http_config`.
Use this when snippet scanning requires different API endpoints or certificate handling.

**Example:**

```json theme={null}
{
  "settings": {
    "file_snippet": {
      "http_config": {
        "base_uri": "https://snippet-api.scanoss.com",
        "ignore_cert_errors": true
      }
    }
  }
}
```
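
The relationship between the root-level and snippet-level `proxy` and `http_config` objects can be sketched in Python as a simple lookup. This assumes that a snippet-level object replaces the root-level object as a whole rather than being merged field by field; that detail is an assumption for illustration, and the function name `effective_snippet_config` is hypothetical.

```python
from typing import Optional


def effective_snippet_config(settings: dict, key: str) -> Optional[dict]:
    """Return file_snippet.<key> when present, otherwise the root-level settings.<key>.

    `settings` is the contents of the "settings" object from scanoss.json.
    Assumption: the snippet-level object overrides the root-level one entirely.
    """
    snippet = settings.get("file_snippet", {})
    if key in snippet:
        return snippet[key]
    return settings.get(key)


settings = {
    "proxy": {"host": "http://corporate-proxy.example.com:8080"},
    "http_config": {"base_uri": "https://api.scanoss.com"},
    "file_snippet": {
        "proxy": {"host": "http://snippet-proxy.example.com:8080"},
    },
}

snippet_proxy = effective_snippet_config(settings, "proxy")       # snippet-level value
snippet_http = effective_snippet_config(settings, "http_config")  # falls back to root
```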

#### High Precision Folder Matching (HPFM)

HPFM settings control matching behaviour for entire folder structures rather than individual
files. This is useful for detecting when large portions of a codebase match known components.

##### `hpfm.ranking_enabled`

Enables ranking for folder-level matching operations. When set to `true`, folder matches are
ranked by confidence score.

**Example:**

```json theme={null}
{
  "settings": {
    "hpfm": {
      "ranking_enabled": true
    }
  }
}
```

##### `hpfm.ranking_threshold`

Specifies the minimum ranking score (`0–10`) required for a folder-level match to be included
in results. Set to `-1` to use the server's default threshold.

**Example:**

```json theme={null}
{
  "settings": {
    "hpfm": {
      "ranking_threshold": 8
    }
  }
}
```

### BOM

The `bom` section defines rules for modifying the Bill of Materials before and after scanning.
It supports four operations: `include`, `exclude`, `remove`, and `replace`.

BOM rules can target either individual files or entire folders. Both
`path` and `purl` fields support folder-level targeting, enabling you to apply rules to all
files within a directory.

#### Include Rules

Specifies components to be passed to the SCANOSS API as additional context during scanning.
These hints inform the API of expected components, increasing the likelihood that they appear
in scan results.

**File-level include:**

```json theme={null}
{
  "bom": {
    "include": [
      {
        "path": "src/lib/component.js",
        "purl": "pkg:npm/vue@2.6.12",
        "comment": "Optional comment"
      }
    ]
  }
}
```

**Folder-level include:**

```json theme={null}
{
  "bom": {
    "include": [
      {
        "path": "src/vendor/",
        "purl": "pkg:npm/lodash@4.17.21",
        "comment": "All files under src/vendor/ are identified as lodash"
      }
    ]
  }
}
```

#### Exclude Rules

Specifies components to be excluded from scan results while still allowing files to be scanned. These rules act as blacklist context for matching files.

**File-level exclude:**

```json theme={null}
{
  "bom": {
    "exclude": [
      {
        "path": "src/lib/component.js",
        "purl": "pkg:npm/vue@2.6.12",
        "comment": "Exclude specific detection"
      }
    ]
  }
}
```

**Folder-level exclude:**

```json theme={null}
{
  "bom": {
    "exclude": [
      {
        "path": "src/vendor/",
        "purl": "pkg:npm/lodash@4.17.21",
        "comment": "Exclude lodash detections under vendor"
      }
    ]
  }
}
```

#### Remove Rules

Specifies components to be removed from scan results during post-processing. These rules are
applied client-side after scanning is complete. Folder-level remove rules act as a blacklist
context for all files within the specified directory.

**File-level remove:**

```json theme={null}
{
  "bom": {
    "remove": [
      {
        "path": "src/lib/component.js",
        "purl": "pkg:npm/vue@2.6.12",
        "comment": "Optional comment"
      }
    ]
  }
}
```

**Folder-level remove:**

```json theme={null}
{
  "bom": {
    "remove": [
      {
        "path": "src/internal/",
        "comment": "Exclude all files under src/internal/ from BOM results"
      }
    ]
  }
}
```

#### Replace Rules

Specifies components to be replaced in scan results during post-processing. These rules are
applied client-side after scanning is complete. The `replace_with` PURL is also contributed
to the scan context, improving server-side matching for the affected files.

When a `license` field is specified in a replace rule, it **overrides** any existing licence
on the replaced component. If no `license` is specified, any licence from the original match
is cleared during replacement.

**File-level replace:**

```json theme={null}
{
  "bom": {
    "replace": [
      {
        "path": "src/utils/helper.js",
        "purl": "pkg:npm/old-lib@1.0.0",
        "replace_with": "pkg:npm/new-lib@2.0.0",
        "license": "MIT",
        "comment": "Optional comment"
      }
    ]
  }
}
```

**Folder-level replace:**

```json theme={null}
{
  "bom": {
    "replace": [
      {
        "path": "src/legacy/",
        "purl": "pkg:npm/old-lib@1.0.0",
        "replace_with": "pkg:npm/new-lib@2.0.0",
        "license": "MIT",
        "comment": "Replace all old-lib matches under src/legacy/ with new-lib"
      }
    ]
  }
}
```
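
The licence behaviour described for replace rules can be illustrated with a short Python sketch. The shape of `match` here (a flat dictionary with `purl` and `license` keys) is a simplification assumed for the example and does not reflect the actual scan result format; `apply_replace` is a hypothetical name.

```python
def apply_replace(match: dict, rule: dict) -> dict:
    """Return a copy of `match` with the component swapped according to `rule`."""
    updated = dict(match)
    updated["purl"] = rule["replace_with"]
    if "license" in rule:
        # A licence on the rule overrides whatever the original match carried.
        updated["license"] = rule["license"]
    else:
        # Without a licence on the rule, the original licence is cleared.
        updated.pop("license", None)
    return updated


original = {"purl": "pkg:npm/old-lib@1.0.0", "license": "GPL-2.0-only"}

with_license = apply_replace(
    original, {"replace_with": "pkg:npm/new-lib@2.0.0", "license": "MIT"}
)
without_license = apply_replace(original, {"replace_with": "pkg:npm/new-lib@2.0.0"})
```

In this sketch `with_license` carries the `MIT` licence from the rule, while `without_license` has no `license` key at all.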

### Matching Priority

When multiple BOM rules could apply to the same file, the following priority order is used
(highest to lowest):

1. **Path + PURL** — both fields match the file path and detected PURL
2. **PURL only** — rule matches on PURL regardless of path
3. **Path only** — rule matches on path regardless of PURL

When two path-only rules could match the same file, the rule with the **longer (more
specific) path wins**. Trailing slashes are treated equivalently (`src/vendor/` and
`src/vendor` match the same set of files).
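
The priority order above can be sketched in Python as a scoring function. This is an illustration under stated assumptions rather than the scanner's implementation: PURLs are compared by exact string equality, a rule path matches a file when it equals the file path or is a parent folder of it, and the names `rule_priority` and `best_rule` are hypothetical.

```python
from typing import Optional


def _path_matches(rule_path: str, file_path: str) -> bool:
    """A rule path matches the file itself or any file beneath that folder."""
    base = rule_path.rstrip("/")  # 'src/vendor/' and 'src/vendor' are equivalent
    return file_path == base or file_path.startswith(base + "/")


def rule_priority(rule: dict, file_path: str, purl: str) -> Optional[tuple]:
    """Return (tier, path_length) for an applicable rule, or None if it does not apply."""
    has_path, has_purl = "path" in rule, "purl" in rule
    if not has_path and not has_purl:
        return None
    if has_path and not _path_matches(rule["path"], file_path):
        return None
    if has_purl and rule["purl"] != purl:
        return None
    if has_path and has_purl:
        tier = 3  # path + PURL
    elif has_purl:
        tier = 2  # PURL only
    else:
        tier = 1  # path only
    path_length = len(rule["path"].rstrip("/")) if has_path else 0
    return (tier, path_length)


def best_rule(rules: list, file_path: str, purl: str) -> Optional[dict]:
    """Pick the highest-priority rule; longer paths win within the same tier."""
    scored = [(rule_priority(r, file_path, purl), r) for r in rules]
    scored = [(p, r) for p, r in scored if p is not None]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]
```

Given a path-only rule for `src/` and another for `src/vendor/`, `best_rule` selects the `src/vendor/` rule for `src/vendor/x.js`, and any applicable rule that also matches the PURL outranks both.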

### Important Notes

**Matching Rules**

* **Full match** — Requires both `path` and `purl` to match. The rule applies only to the
  specific file (or files within the folder) at the given path with the matching PURL.
* **Partial match** — Matches on either:
  * `path` only (`purl` is omitted) — the rule applies to all files at or under the matching path.
  * `purl` only (`path` is omitted) — the rule applies to all files with the matching PURL.

**File Paths**

* All paths must be specified relative to the scanned directory.
* Use forward slashes (`/`) as path separators.
* A trailing slash indicates a folder-level rule (`src/vendor/` targets all files inside
  `src/vendor/`).

Given the following example directory structure:

```bash theme={null}
project/
├── src/
│   └── component.js
└── lib/
    └── utils.py
```

If the scanned directory is `/project/src`, then:

* `component.js` is a valid path.
* `lib/utils.py` is an invalid path and will not match any files.

If the scanned directory is `/project`, then:

* `src/component.js` is a valid path.
* `lib/utils.py` is a valid path.
* `src/` targets all files under `src/`.
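
One way to derive a valid rule path is to compute the file's location relative to the scanned directory. The Python sketch below does this with the standard `pathlib` module; the function name `rule_path` is hypothetical and the paths mirror the example structure above.

```python
from pathlib import PurePosixPath
from typing import Optional


def rule_path(scan_root: str, file_path: str) -> Optional[str]:
    """Return file_path relative to scan_root, or None if it lies outside the scan root."""
    try:
        relative = PurePosixPath(file_path).relative_to(PurePosixPath(scan_root))
    except ValueError:
        return None  # the file is not under the scanned directory
    return relative.as_posix()  # always uses forward slashes


inside = rule_path("/project/src", "/project/src/component.js")
outside = rule_path("/project/src", "/project/lib/utils.py")
from_root = rule_path("/project", "/project/lib/utils.py")
```

Here `inside` is `"component.js"`, `outside` is `None` because `lib/` is not under `/project/src`, and `from_root` is `"lib/utils.py"`.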

**Package URLs (PURLs)**

PURLs must follow the [Package URL specification](https://github.com/package-url/purl-spec):

* Format: `pkg:<type>/<namespace>/<name>@<version>` (the namespace is optional and depends on the package type)
* Examples:
  * `pkg:npm/vue@2.6.12`
  * `pkg:golang/github.com/golang/go@1.17.3`
* Must be valid and include all required components.
* A version is strongly recommended but is optional.
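
As a rough illustration of the format, the Python sketch below splits a PURL into its main parts. It is not a full implementation of the Package URL specification: qualifiers and subpaths are discarded, percent-encoding is not decoded, and the function name `split_purl` is hypothetical. For real validation, a dedicated PURL library is a better choice.

```python
def split_purl(purl: str) -> dict:
    """Split a PURL into type, namespace, name, and version (simplified)."""
    if not purl.startswith("pkg:"):
        raise ValueError("a PURL must start with 'pkg:'")
    rest = purl[len("pkg:"):]
    rest = rest.split("#", 1)[0].split("?", 1)[0]  # drop subpath and qualifiers
    path, sep, version = rest.rpartition("@")
    if not sep:                                    # no '@' means no version
        path, version = rest, None
    ptype, _, remainder = path.partition("/")
    namespace, _, name = remainder.rpartition("/")
    if not ptype or not name:
        raise ValueError("a PURL needs at least a type and a name")
    return {
        "type": ptype,
        "namespace": namespace or None,            # optional for many package types
        "name": name,
        "version": version,                        # optional but recommended
    }


vue = split_purl("pkg:npm/vue@2.6.12")
go = split_purl("pkg:golang/github.com/golang/go@1.17.3")
```

For the two examples above, `vue` has no namespace and version `2.6.12`, while `go` has the namespace `github.com/golang`, the name `go`, and version `1.17.3`.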

## Full Example

A complete configuration combining most of the settings described above:

```json theme={null}
{
  "self": {
    "name": "example-project",
    "license": "Apache-2.0",
    "description": "Example project with advanced snippet tuning"
  },
  "settings": {
    "skip": {
      "patterns": {
        "scanning": ["node_modules/", "dist/", "build/"],
        "fingerprinting": [
          "*.log",
          "!important.log",
          "temp/",
          "*.tmp",
          "debug[0-9]*.txt",
          "**/*test.js"
        ],
        "dependencies": ["vendor/**", "third_party/"]
      },
      "sizes": {
        "scanning": [
          {
            "patterns": ["*.log", "!important.log"],
            "min": 512,
            "max": 5242880
          }
        ],
        "fingerprinting": [
          {
            "patterns": [
              "temp/",
              "debug[0-9]*.txt",
              "src/client/specific-file.js"
            ],
            "min": 512,
            "max": 5242880
          }
        ]
      }
    },
    "proxy": {
      "host": "http://corporate-proxy.example.com:8080"
    },
    "http_config": {
      "base_uri": "https://api.scanoss.com",
      "ignore_cert_errors": false
    },
    "file_snippet": {
      "min_snippet_hits": 5,
      "min_snippet_lines": 3,
      "ranking_enabled": true,
      "ranking_threshold": 7,
      "honour_file_exts": true,
      "skip_headers": true,
      "skip_headers_limit": 50,
      "dependency_analysis": true
    },
    "hpfm": {
      "ranking_enabled": true,
      "ranking_threshold": 8
    }
  },
  "bom": {
    "include": [
      {
        "path": "src/lib/component.js",
        "purl": "pkg:npm/lodash@4.17.21",
        "comment": "File-level include"
      },
      {
        "path": "src/vendor/",
        "purl": "pkg:npm/lodash@4.17.21",
        "comment": "Folder-level include"
      }
    ],
    "exclude": [
      {
        "path": "src/lib/component.js",
        "purl": "pkg:npm/vue@2.6.12",
        "comment": "File-level exclude"
      },
      {
        "path": "src/vendor/",
        "purl": "pkg:npm/lodash@4.17.21",
        "comment": "Folder-level exclude"
      }
    ],
    "remove": [
      {
        "purl": "pkg:npm/deprecated-pkg@1.0.0",
        "comment": "PURL-only remove"
      },
      {
        "path": "src/internal/",
        "comment": "Folder-level remove"
      }
    ],
    "replace": [
      {
        "path": "src/utils/helper.js",
        "purl": "pkg:npm/old-lib@1.0.0",
        "replace_with": "pkg:npm/new-lib@2.0.0",
        "license": "MIT",
        "comment": "File-level replace"
      },
      {
        "path": "src/legacy/",
        "purl": "pkg:npm/old-lib@1.0.0",
        "replace_with": "pkg:npm/new-lib@2.0.0",
        "license": "MIT",
        "comment": "Folder-level replace"
      }
    ]
  }
}
```
