Search & Fill
Connect local data sources or Fill By Link respondent records, search for a record, and populate mapped fields with predictable behavior.
How to use this docs page
This page is meant to answer one operational stage of the DullyPDF workflow well enough that you can run a controlled test without guessing. Read the sections below, validate the behavior against one representative document, and only then move to the next linked page.
That order matters because most setup failures come from mixing detection, mapping, fill validation, and sharing into one unstructured pass. A narrower review loop keeps troubleshooting faster and makes the template easier to trust once you save it for reuse.
Fill from file demo
Fill PDF from CSV, Excel, JSON, SQL, or TXT
This walkthrough shows how to load a saved PDF template and auto-fill it from a CSV, Excel (XLSX), JSON, SQL query result, or TXT data source without leaving the browser.
Use this video when you need to prove that DullyPDF can fill the same template from CSV, XLSX, JSON, SQL, and TXT payloads before rolling the workflow out to the rest of the team.
Data source support
The data source dropdown accepts five file types. CSV, Excel, and JSON provide both schema headers and record rows for Search & Fill. SQL and TXT provide schema columns only (for mapping) — they do not include row data unless the SQL file also contains INSERT INTO statements.
- Fill By Link respondent submissions are stored as structured records and can be selected from the workspace just like local rows.
- CSV/XLSX/JSON/SQL parsers cap records at 5,000 rows per import.
- Duplicate headers are auto-renamed with numeric suffixes (`name`, `name_2`, `name_3`, ...).
- Header normalization trims whitespace, converts to lowercase, replaces spaces and hyphens with underscores, and removes other punctuation.
- Schema type inference samples up to 200 rows when detecting column types automatically.
- Allowed column types across all formats: `string`, `int`, `date`, `bool`.
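The normalization and deduplication rules above can be sketched in Python. This is an illustrative sketch, not DullyPDF's actual implementation; the function names `normalize_header` and `dedupe_headers` are hypothetical.

```python
import re

def normalize_header(raw: str) -> str:
    """Trim, lowercase, turn spaces/hyphens into underscores, drop other punctuation."""
    h = raw.strip().lower()
    h = re.sub(r"[ \-]+", "_", h)
    h = re.sub(r"[^a-z0-9_]", "", h)
    return h

def dedupe_headers(headers):
    """Rename duplicates with numeric suffixes: name, name_2, name_3, ..."""
    seen = {}
    out = []
    for raw in headers:
        name = normalize_header(raw)
        if not name:
            continue  # empty headers are filtered out
        seen[name] = seen.get(name, 0) + 1
        out.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return out
```

For example, `dedupe_headers(["First Name", "first-name", "Email!"])` yields `["first_name", "first_name_2", "email"]`, matching the documented suffix and normalization behavior.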
CSV file format
CSV files follow the RFC 4180 standard. The first row is treated as the header row. All subsequent rows become searchable records.
Example CSV:
first_name,last_name,dob,email,phone
John,Smith,1990-05-14,john@example.com,555-0100
Jane,Doe,1985-11-02,jane@example.com,555-0200
Bob,Johnson,1978-03-21,bob@example.com,555-0300
- Default delimiter is comma. Quoted fields with escaped double-quotes are handled.
- Both `\r\n` and `\n` line endings are supported.
- BOM (byte-order mark) is stripped automatically.
- Empty headers are filtered out. Empty rows (all cells blank) are skipped.
- Maximum 5,000 rows per import. All values are stored as strings.
- Column types (string, int, date, bool) are inferred automatically from the first 200 rows of data.
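The CSV rules above map cleanly onto Python's standard `csv` module. This is a minimal sketch of the documented behavior (BOM stripping, header row, blank-row skipping, 5,000-row cap); `read_csv_records` is a hypothetical name, not a DullyPDF API.

```python
import csv
import io

MAX_ROWS = 5_000  # documented import cap

def read_csv_records(text: str):
    """Parse CSV text: row 1 is the header row, later rows become records."""
    text = text.lstrip("\ufeff")  # BOM is stripped automatically
    rows = list(csv.reader(io.StringIO(text)))  # handles quoting, \r\n and \n
    if not rows:
        return [], []
    headers = rows[0]
    records = []
    for row in rows[1:]:
        if not any(cell.strip() for cell in row):
            continue  # skip rows whose cells are all blank
        records.append({h.strip(): c for h, c in zip(headers, row) if h.strip()})
        if len(records) >= MAX_ROWS:
            break
    return [h.strip() for h in headers if h.strip()], records
```

All values stay strings here, as the rules above require; type inference is a separate pass over the first 200 rows.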
JSON file format
JSON files can use several structures. DullyPDF auto-detects the layout and extracts headers and rows.
Simplest format — array of objects (recommended):
[
{ "first_name": "John", "last_name": "Smith", "dob": "1990-05-14" },
{ "first_name": "Jane", "last_name": "Doe", "dob": "1985-11-02" }
]
Nested structure with explicit schema:
{
"fields": [
{ "name": "first_name", "type": "string" },
{ "name": "dob", "type": "date" }
],
"rows": [
{ "first_name": "John", "dob": "1990-05-14" }
]
}
- Accepted top-level array keys for rows: `rows`, `records`, `data`, `items`, `entries`.
- Accepted schema keys: `schema.fields`, `fields`, `columns`, `headers`.
- Each field entry can be a string name or an object with a `name`, `field`, `column`, or `id` property.
- Nested objects are flattened with underscore separators (for example `address.city` becomes `address_city`) up to 6 levels deep.
- A single top-level object (not an array) is treated as one record row.
- JSONL format (one JSON object per line) is also supported.
- Maximum 5,000 rows. Type inference works the same as CSV when no explicit schema types are provided.
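The underscore-flattening rule can be illustrated with a short sketch. This assumes flattening stops at the documented 6-level limit and leaves deeper values untouched; `flatten` is a hypothetical helper, not part of DullyPDF.

```python
MAX_DEPTH = 6  # documented nesting limit

def flatten(obj: dict, prefix: str = "", depth: int = 1) -> dict:
    """Flatten nested objects with underscore separators (address.city -> address_city)."""
    out = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict) and depth < MAX_DEPTH:
            out.update(flatten(value, name, depth + 1))
        else:
            out[name] = value  # leaves, and anything past the depth limit, kept as-is
    return out
```

So `{"address": {"city": "Oslo"}}` becomes `{"address_city": "Oslo"}` before the row enters Search & Fill.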
SQL file format
SQL files are parsed for CREATE TABLE statements to extract column names and types. If the file also contains INSERT INTO statements, those rows are extracted as searchable records.
Schema-only example:
CREATE TABLE patients (
  mrn VARCHAR(20) NOT NULL,
  first_name VARCHAR(100),
  last_name VARCHAR(100),
  date_of_birth DATE,
  is_active BOOLEAN DEFAULT true,
  age INT
);
Schema with data rows:
CREATE TABLE patients (
mrn VARCHAR(20),
first_name VARCHAR(100),
last_name VARCHAR(100)
);
INSERT INTO patients VALUES
('MRN001', 'John', 'Smith'),
('MRN002', 'Jane', 'Doe');
- SQL type mapping: `VARCHAR`, `CHAR`, `TEXT`, `UUID`, `JSON`, `ENUM` → string; `INT`, `INTEGER`, `SMALLINT`, `BIGINT`, `SERIAL`, `NUMERIC`, `DECIMAL`, `FLOAT`, `DOUBLE` → int; `DATE`, `DATETIME`, `TIMESTAMP`, `TIME` → date; `BOOLEAN`, `BOOL`, `BIT` → bool.
- Precision and length qualifiers are stripped (`VARCHAR(255)` → `VARCHAR`).
- Quoted identifiers (backtick or double-quote) are handled.
- Constraint lines (`PRIMARY KEY`, `UNIQUE`, `INDEX`, `FOREIGN KEY`, `CHECK`) are filtered out.
- SQL comments (`-- line` and `/* block */`) are stripped before parsing.
- Multiple `CREATE TABLE` statements are merged; duplicate column names are dropped.
- `INSERT INTO` rows are capped at 5,000. Without `INSERT` statements, the file is schema-only (mapping but no Search & Fill rows).
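The type-mapping table above can be expressed as a small lookup. This is a sketch of the documented buckets only; the fallback to `string` for unrecognized types is an assumption, and `map_sql_type` is a hypothetical name.

```python
import re

# Documented SQL-to-schema type buckets
SQL_TYPE_MAP = {
    **dict.fromkeys(["VARCHAR", "CHAR", "TEXT", "UUID", "JSON", "ENUM"], "string"),
    **dict.fromkeys(["INT", "INTEGER", "SMALLINT", "BIGINT", "SERIAL",
                     "NUMERIC", "DECIMAL", "FLOAT", "DOUBLE"], "int"),
    **dict.fromkeys(["DATE", "DATETIME", "TIMESTAMP", "TIME"], "date"),
    **dict.fromkeys(["BOOLEAN", "BOOL", "BIT"], "bool"),
}

def map_sql_type(raw: str) -> str:
    """Strip precision qualifiers (VARCHAR(255) -> VARCHAR), then map to a schema type."""
    base = re.sub(r"\(.*\)", "", raw).strip().upper()
    return SQL_TYPE_MAP.get(base, "string")  # assumption: unknown types default to string
```

For example, `DECIMAL(10,2)` strips to `DECIMAL` and maps to `int`, matching the bucket list above.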
TXT schema file format
TXT files define schema columns only — one field per line. They do not contain record rows, so they support mapping but not Search & Fill.
Example TXT schema:
# Patient intake schema
first_name:string
last_name:string
date_of_birth:date
mrn:string
is_active:bool
age:int
email
- Format: `field_name` or `field_name:type`. Fields without a type default to `string`.
- Allowed types: `string`, `int`, `date`, `bool` (case-insensitive).
- Lines starting with `#` are comments and ignored.
- Blank lines are ignored.
- Duplicate field names are skipped.
- Whitespace is trimmed from both name and type.
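The TXT rules above amount to a line-by-line parser. This sketch follows the documented behavior (comments, blanks, duplicates, default type); treating an unrecognized type as `string` is an assumption, and `parse_txt_schema` is a hypothetical name.

```python
ALLOWED_TYPES = {"string", "int", "date", "bool"}

def parse_txt_schema(text: str):
    """Parse one field per line: `name` or `name:type`."""
    fields = []
    seen = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        name, _, ftype = line.partition(":")
        name, ftype = name.strip(), ftype.strip().lower()  # whitespace trimmed from both
        if name in seen:
            continue  # duplicate field names are skipped
        seen.add(name)
        fields.append({"name": name,
                       "type": ftype if ftype in ALLOWED_TYPES else "string"})
    return fields
```

Running it on the example schema above yields `first_name` through `age` with their declared types, and `email` defaulting to `string`.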
Excel file format
Excel files (.xlsx, .xls) are read from the first sheet by default. The first row is treated as the header row, and all subsequent rows become searchable records.
- Same header deduplication and normalization rules as CSV.
- Empty rows are skipped. All values are stored as strings.
- Maximum 5,000 rows per import.
- Column types are inferred automatically from the data, same as CSV and JSON.
Fill flow
- If you published Fill By Link, open the respondent list for that saved template and select a saved submission.
- Choose a column (`Any column` is available) and match mode (`contains` or `equals`).
- Search is case-insensitive and returns at most 25 results per query.
- Click `Fill PDF` on a result row to write values to current fields.
- Date fields normalize accepted values like `YYYY-MM-DD` and `YYYY/MM/DD` to `YYYY-MM-DD`.
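The date normalization in the last step can be sketched with the standard library. Only the two formats named above are shown; passing unrecognized values through unchanged is an assumption, and `normalize_date` is a hypothetical name.

```python
from datetime import datetime

def normalize_date(value: str) -> str:
    """Normalize accepted inputs like YYYY-MM-DD and YYYY/MM/DD to YYYY-MM-DD."""
    for fmt in ("%Y-%m-%d", "%Y/%m/%d"):
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return value  # assumption: unrecognized values pass through unchanged
```

So a source row carrying `1990/05/14` lands in a date field as `1990-05-14`.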
Guardrails
- If mapping is incomplete, fill coverage will be partial.
- Clear and refill when testing mapping revisions.
- Validate at least one full record before saving templates for teams.
- Search & Fill is enabled only for CSV/XLSX/JSON with at least one row, stored respondent records, and a loaded document. SQL and TXT are schema-only sources.
- Fill By Link submissions draw from an account-level monthly quota rather than a fixed per-link cap: the base plan allows 25 accepted responses per month, and premium allows 10,000.
Search & Fill versus Fill By Link versus API Fill
- Use Search & Fill when an operator is choosing one record inside the workspace.
- Use Fill By Link when the record still needs to be collected from a respondent first.
- Use API Fill when another system already has the record and needs a hosted JSON-to-PDF endpoint.
Field resolution heuristics (non-checkbox)
- Exact normalized name match is attempted first.
- Fallback prefixes: `patient_` and `responsible_party_` are checked automatically.
- `name` falls back to `full_name`, or to `first_name + last_name`.
- `age` is derived from `dob`/`date_of_birth` and a reference `date`/`visit_date` (or the current date).
- `city_state_zip` is composed from `city`, `state`, and `zip` when available.
- Numeric suffix fields like `phone_1` fall back to the base key `phone`.
- List fields (`allergy_1`, `medication_1`, `diagnosis_1`) can be sourced from delimited lists (comma, pipe, and similar separators).
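A few of the heuristics above can be sketched as a resolution order. This is illustrative only: the direction of the prefix fallback (a `patient_`-prefixed template field matching an unprefixed source column) is an assumption, and `resolve_field` is a hypothetical name.

```python
import re

def resolve_field(field: str, record: dict):
    """Resolve a template field against a source record using the documented fallbacks."""
    norm = field.strip().lower()
    if norm in record:                        # exact normalized match first
        return record[norm]
    for prefix in ("patient_", "responsible_party_"):  # documented fallback prefixes
        if norm.startswith(prefix) and norm[len(prefix):] in record:
            return record[norm[len(prefix):]]
    if norm == "name":                        # name -> full_name, else first + last
        if "full_name" in record:
            return record["full_name"]
        if "first_name" in record and "last_name" in record:
            return f"{record['first_name']} {record['last_name']}"
    suffix = re.fullmatch(r"(.+)_\d+", norm)  # phone_1 -> base key phone
    if suffix and suffix.group(1) in record:
        return record[suffix.group(1)]
    return None  # unresolved: field stays empty (partial fill)
```

A `None` result here corresponds to the partial-fill cases discussed later on this page.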
Checkbox groups and aliases
Built-in alias fallbacks include groups like:
- `allergies`: aliases `allergy`, `has_allergies`
- `pregnant`: aliases `pregnancy`, `pregnancy_status`, `is_pregnant`
- `drug_use`: aliases `substance_use`, `illicit_drug_use`, `has_drug_use`
- `alcohol_use`: aliases `drinks_alcohol`, `etoh_use`, `has_alcohol_use`
- `tobacco_use`: aliases `smoking`, `smoker`, `smoking_status`, `has_tobacco_use`
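The alias fallback amounts to trying the canonical key and then each alias in order. This sketch covers a subset of the groups listed above; the truthy-marker set and the `checkbox_value` name are assumptions, not DullyPDF internals.

```python
# Built-in checkbox alias groups from this page (illustrative subset)
CHECKBOX_ALIASES = {
    "allergies": ["allergy", "has_allergies"],
    "pregnant": ["pregnancy", "pregnancy_status", "is_pregnant"],
    "tobacco_use": ["smoking", "smoker", "smoking_status", "has_tobacco_use"],
}

TRUTHY = {"true", "yes", "y", "1", "x"}  # assumption: typical truthy markers

def checkbox_value(group: str, record: dict):
    """Try the canonical key, then each alias; coerce the first hit to a boolean."""
    for key in [group, *CHECKBOX_ALIASES.get(group, [])]:
        if key in record:
            return str(record[key]).strip().lower() in TRUTHY
    return None  # no source column found; checkbox stays untouched
```

For instance, a source column `smoker` with value `Yes` checks the `tobacco_use` group even though no column is named `tobacco_use`.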
Why partial fills happen
- Some fields are still unmapped or mapped to unstable source headers.
- Date or checkbox values need normalization rules that the current row does not satisfy.
- The template was updated but the operator is still validating stale output without clearing and refilling.
- Alias fallbacks help, but they do not replace explicit mapping on important production templates.
