419 lines
10 KiB
Markdown
419 lines
10 KiB
Markdown
# Prism XML File Format & Schemas
|
|
|
|
## Overview
|
|
|
|
GraphPad Prism uses XML-based file formats that allow external programs to read and modify data without using Prism's limited scripting language.
|
|
|
|
## File Formats
|
|
|
|
### PZFX - Prism XML Format
|
|
- **Extension**: `.pzfx`
|
|
- **Format**: Plain text XML
|
|
- **Readable**: Yes - open in any text editor
|
|
- **Data tables**: Fully accessible as XML
|
|
- **Info tables**: Fully accessible as XML
|
|
- **Analysis results**: Encrypted but readable as XML
|
|
- **Graphs/settings**: Encrypted (not editable externally)
|
|
- **Use case**: Primary format for external data access
|
|
|
|
### PZF - Prism Binary Format
|
|
- **Extension**: `.pzf`
|
|
- **Format**: Binary (not human-readable)
|
|
- **Use case**: Smaller file size, faster loading
|
|
- **External access**: None - must convert to PZFX first
|
|
|
|
### PZM - Prism Template/Master
|
|
- **Extension**: `.pzm`
|
|
- **Format**: Can be PZFX or PZF
|
|
- **Use case**: Templates for repeated analysis
|
|
- **Scripting**: Open with `Open "template.pzm"`
|
|
|
|
### PZC - Prism Script
|
|
- **Extension**: `.pzc`
|
|
- **Format**: Plain text script commands
|
|
- **Use case**: Automation scripts
|
|
|
|
## Official XML Schemas
|
|
|
|
GraphPad provides official XML schemas for Prism 7.0+ format.
|
|
|
|
**Schema files included in this skill:**
|
|
- `Prism7XMLSchema.xml` - Complete schema definition
|
|
- `Prism7XMLStyleSheet.xml` - XSLT for transforming Prism XML
|
|
|
|
**Schema location**: `../../prism-xml-schema/`
|
|
|
|
**Schema covers**:
|
|
- Data table structure
|
|
- Info table structure
|
|
- Formatting elements (fonts, colors, alignment)
|
|
- Metadata (version, creation date, user info)
|
|
- Text formatting (bold, italic, underline, super/subscript)
|
|
|
|
## PZFX File Structure
|
|
|
|
### Root Structure
|
|
```xml
|
|
<?xml version="1.0" encoding="UTF-8"?>
|
|
<GraphPadPrismFile>
|
|
<OriginalVersion CreatedByProgram="Prism" CreatedByVersion="10.0.0" />
|
|
<TableSequence>
|
|
<Ref ID="Table0" />
|
|
<Ref ID="Table1" />
|
|
</TableSequence>
|
|
<Table ID="Table0" ...>
|
|
<!-- Data table content -->
|
|
</Table>
|
|
<InfoSequence>
|
|
<Ref ID="Info0" />
|
|
</InfoSequence>
|
|
<Info ID="Info0" ...>
|
|
<!-- Info constants -->
|
|
</Info>
|
|
</GraphPadPrismFile>
|
|
```
|
|
|
|
### Data Table Elements
|
|
|
|
#### Basic Data Table
|
|
```xml
|
|
<Table ID="Data 1" XFormat="none" YFormat="replicates" Replicates="3">
|
|
<Title>Dose Response Data</Title>
|
|
|
|
<!-- Column titles -->
|
|
<ColumnTitlesRow>
|
|
<d>X</d>
|
|
<d>Y</d>
|
|
</ColumnTitlesRow>
|
|
|
|
<!-- X Column -->
|
|
<XColumn Width="81">
|
|
<d>0.1</d>
|
|
<d>1.0</d>
|
|
<d>10.0</d>
|
|
</XColumn>
|
|
|
|
<!-- Y Column with replicates -->
|
|
<YColumn Width="81">
|
|
<Title>Control</Title>
|
|
<Subcolumn>
|
|
<d>10.5</d>
|
|
<d>12.3</d>
|
|
<d>11.8</d>
|
|
</Subcolumn>
|
|
</YColumn>
|
|
|
|
<YColumn Width="81">
|
|
<Title>Treatment</Title>
|
|
<Subcolumn>
|
|
<d>8.2</d>
|
|
<d>9.1</d>
|
|
<d>8.7</d>
|
|
</Subcolumn>
|
|
</YColumn>
|
|
</Table>
|
|
```
|
|
|
|
#### Table Attributes
|
|
- `ID` - Unique identifier (e.g., "Data 1", "Table0")
|
|
- `XFormat` - X column format: "none", "numbers", "text", "date"
|
|
- `YFormat` - Y format: "replicates", "SD", "SEM", "mean"
|
|
- `Replicates` - Number of subcolumns per Y column
|
|
- `RowTitlesWidth` - Width of row titles column
|
|
|
|
### Info Table Elements
|
|
|
|
```xml
|
|
<Info ID="Info0">
|
|
<Title>Experiment Information</Title>
|
|
|
|
<!-- Individual constants -->
|
|
<Constant>
|
|
<Name>Date</Name>
|
|
<Value>2024-01-15</Value>
|
|
</Constant>
|
|
|
|
<Constant>
|
|
<Name>Experimenter</Name>
|
|
<Value>Jane Smith</Value>
|
|
</Constant>
|
|
|
|
<Constant>
|
|
<Name>Concentration</Name>
|
|
<Value>10.5</Value>
|
|
</Constant>
|
|
|
|
<!-- Notes section -->
|
|
<Notes>
|
|
Experiment notes go here.
|
|
Multiple lines allowed.
|
|
</Notes>
|
|
</Info>
|
|
```
|
|
|
|
### Results Tables (Encrypted)
|
|
|
|
```xml
|
|
<HugeResults ID="HugeResults0">
|
|
<!-- Results are encrypted but structure is visible -->
|
|
<OriginalVersion ... />
|
|
<AnalysisParams ...>
|
|
<!-- Some parameters visible -->
|
|
</AnalysisParams>
|
|
<!-- Actual results encrypted -->
|
|
</HugeResults>
|
|
```
|
|
|
|
**Note**: Results are readable as XML but values are encrypted. You can see structure but not values without Prism.
|
|
|
|
### Data Types
|
|
|
|
#### Numeric Values
|
|
```xml
|
|
<d>10.5</d> <!-- Regular number -->
|
|
<d>1.23e-4</d> <!-- Scientific notation -->
|
|
<d></d> <!-- Missing/empty value -->
|
|
```
|
|
|
|
#### Text Values
|
|
```xml
|
|
<d>Sample A</d> <!-- Plain text -->
|
|
<d>Control</d> <!-- Text in row/column titles -->
|
|
```
|
|
|
|
#### Formatted Text
|
|
```xml
|
|
<d>
|
|
<b>Bold text</b>
|
|
<i>Italic text</i>
|
|
<u>Underlined</u>
|
|
<sup>Superscript</sup>
|
|
<sub>Subscript</sub>
|
|
</d>
|
|
```
|
|
|
|
#### Dates
|
|
```xml
|
|
<Constant>
|
|
<Name>ExperimentDate</Name>
|
|
<Value>2024-01-15</Value>
|
|
</Constant>
|
|
```
|
|
|
|
## Parsing PZFX Files
|
|
|
|
### Python Example
|
|
|
|
```python
|
|
import xml.etree.ElementTree as ET
|
|
|
|
# Parse file
|
|
tree = ET.parse('experiment.pzfx')
|
|
root = tree.getroot()
|
|
|
|
# Find all data tables
|
|
for table in root.findall('.//Table'):
|
|
table_id = table.get('ID')
|
|
title = table.find('Title').text
|
|
|
|
print(f"Table: {table_id} - {title}")
|
|
|
|
# Get X values
|
|
x_column = table.find('XColumn')
|
|
if x_column is not None:
|
|
x_values = [d.text for d in x_column.findall('d')]
|
|
print(f" X values: {x_values}")
|
|
|
|
# Get Y values
|
|
for y_column in table.findall('YColumn'):
|
|
y_title = y_column.find('Title').text
|
|
subcolumn = y_column.find('Subcolumn')
|
|
y_values = [d.text for d in subcolumn.findall('d')]
|
|
print(f" {y_title}: {y_values}")
|
|
```
|
|
|
|
### Reading Info Constants
|
|
|
|
```python
|
|
# Find info tables
|
|
for info in root.findall('.//Info'):
|
|
info_id = info.get('ID')
|
|
print(f"Info table: {info_id}")
|
|
|
|
# Get all constants
|
|
for constant in info.findall('Constant'):
|
|
name = constant.find('Name').text
|
|
value = constant.find('Value').text
|
|
print(f" {name}: {value}")
|
|
|
|
# Get notes
|
|
notes = info.find('Notes')
|
|
if notes is not None:
|
|
print(f" Notes: {notes.text}")
|
|
```
|
|
|
|
### Modifying Data
|
|
|
|
```python
|
|
# Load template
|
|
tree = ET.parse('template.pzfx')
|
|
|
|
# Find first data table
|
|
table = tree.find('.//Table[@ID="Data 1"]')
|
|
|
|
# Modify X values
|
|
x_column = table.find('XColumn')
|
|
x_cells = x_column.findall('d')
|
|
x_cells[0].text = "0.5"
|
|
x_cells[1].text = "5.0"
|
|
|
|
# Modify Y values
|
|
y_column = table.find('YColumn')
|
|
subcolumn = y_column.find('Subcolumn')
|
|
y_cells = subcolumn.findall('d')
|
|
y_cells[0].text = "15.2"
|
|
|
|
# Save modified file
|
|
tree.write('modified.pzfx', encoding='UTF-8', xml_declaration=True)
|
|
```
|
|
|
|
## XPath Query Examples
|
|
|
|
### Find specific table
|
|
```python
|
|
table = root.find(".//Table[@ID='Data 1']")
|
|
```
|
|
|
|
### Find all Y columns
|
|
```python
|
|
y_columns = root.findall(".//Table[@ID='Data 1']/YColumn")
|
|
```
|
|
|
|
### Find specific info constant
|
|
```python
|
|
date = root.find(".//Info/Constant[Name='Date']/Value").text
|
|
```
|
|
|
|
### Count data points
|
|
```python
|
|
num_points = len(root.findall(".//Table[@ID='Data 1']/XColumn/d"))
|
|
```
|
|
|
|
## Validation
|
|
|
|
### Using xmllint (command line)
|
|
```bash
|
|
# Validate against schema
|
|
xmllint --schema Prism7XMLSchema.xml experiment.pzfx
|
|
|
|
# Pretty print
|
|
xmllint --format experiment.pzfx
|
|
|
|
# Extract specific elements
|
|
xmllint --xpath "//Table/@ID" experiment.pzfx
|
|
```
|
|
|
|
### Python Validation
|
|
```python
|
|
from lxml import etree
|
|
|
|
# Load schema
|
|
schema_doc = etree.parse('Prism7XMLSchema.xml')
|
|
schema = etree.XMLSchema(schema_doc)
|
|
|
|
# Validate file
|
|
doc = etree.parse('experiment.pzfx')
|
|
is_valid = schema.validate(doc)
|
|
|
|
if not is_valid:
|
|
print(schema.error_log)
|
|
```
|
|
|
|
## Common Patterns
|
|
|
|
### Extract all data to CSV
|
|
```python
|
|
import csv
|
|
|
|
tree = ET.parse('experiment.pzfx')
|
|
|
|
for table in tree.findall('.//Table'):
|
|
table_id = table.get('ID')
|
|
|
|
with open(f'{table_id}.csv', 'w', newline='') as csvfile:
|
|
writer = csv.writer(csvfile)
|
|
|
|
# Write column titles
|
|
titles = [d.text for d in table.find('ColumnTitlesRow').findall('d')]
|
|
writer.writerow(titles)
|
|
|
|
# Write data rows
|
|
x_column = table.find('XColumn')
|
|
x_values = [d.text for d in x_column.findall('d')]
|
|
|
|
for i, x_val in enumerate(x_values):
|
|
row = [x_val]
|
|
for y_column in table.findall('YColumn'):
|
|
subcolumn = y_column.find('Subcolumn')
|
|
y_values = [d.text for d in subcolumn.findall('d')]
|
|
row.append(y_values[i] if i < len(y_values) else '')
|
|
writer.writerow(row)
|
|
```
|
|
|
|
### Batch update info constants
|
|
```python
|
|
import glob
|
|
|
|
for pzfx_file in glob.glob('*.pzfx'):
|
|
tree = ET.parse(pzfx_file)
|
|
|
|
# Find or create date constant
|
|
info = tree.find('.//Info')
|
|
date_constant = info.find("./Constant[Name='Date']")
|
|
|
|
if date_constant is None:
|
|
# Create new constant
|
|
constant = ET.SubElement(info, 'Constant')
|
|
ET.SubElement(constant, 'Name').text = 'Date'
|
|
ET.SubElement(constant, 'Value').text = '2024-01-15'
|
|
else:
|
|
# Update existing
|
|
date_constant.find('Value').text = '2024-01-15'
|
|
|
|
tree.write(pzfx_file, encoding='UTF-8', xml_declaration=True)
|
|
```
|
|
|
|
## Limitations
|
|
|
|
**What you CAN access**:
|
|
- ✅ All data values in tables
|
|
- ✅ All info constants
|
|
- ✅ Table structure and format
|
|
- ✅ Column/row titles
|
|
- ✅ File metadata
|
|
|
|
**What you CANNOT access**:
|
|
- ❌ Analysis parameter details (encrypted)
|
|
- ❌ Calculated results values (encrypted)
|
|
- ❌ Graph appearance settings (encrypted)
|
|
- ❌ Analysis method details (encrypted)
|
|
|
|
**Workaround**: Let Prism do the analysis. You modify data, Prism recalculates everything when opened.
|
|
|
|
## Best Practices
|
|
|
|
1. **Always preserve XML declaration**: Keep `<?xml version="1.0"?>` header
|
|
2. **Maintain structure**: Don't remove or reorder major elements
|
|
3. **Validate after modification**: Use xmllint or schema validation
|
|
4. **Backup originals**: Keep copy before programmatic modification
|
|
5. **Use proper encoding**: Save as UTF-8 with XML declaration
|
|
6. **Handle missing values**: Empty `<d></d>` tags for missing data
|
|
7. **Test with Prism**: Open modified files in Prism to verify
|
|
|
|
## Resources
|
|
|
|
- **Schema files**: `../../prism-xml-schema/Prism7XMLSchema.xml`
|
|
- **Stylesheet**: `../../prism-xml-schema/Prism7XMLStyleSheet.xml`
|
|
- **GraphPad documentation**: https://www.graphpad.com/
|
|
- **XML tools**: xmllint, Python lxml, R xml2 package
|