# Prism XML File Format & Schemas

## Overview

GraphPad Prism uses XML-based file formats that allow external programs to read and modify data without using Prism's limited scripting language.

## File Formats

### PZFX - Prism XML Format

- **Extension**: `.pzfx`
- **Format**: Plain text XML
- **Readable**: Yes - open in any text editor
- **Data tables**: Fully accessible as XML
- **Info tables**: Fully accessible as XML
- **Analysis results**: Structure is visible as XML, but the values are encrypted
- **Graphs/settings**: Encrypted (not editable externally)
- **Use case**: Primary format for external data access

### PZF - Prism Binary Format

- **Extension**: `.pzf`
- **Format**: Binary (not human-readable)
- **Use case**: Smaller file size, faster loading
- **External access**: None - must convert to PZFX first

### PZM - Prism Template/Master

- **Extension**: `.pzm`
- **Format**: Can be PZFX or PZF
- **Use case**: Templates for repeated analysis
- **Scripting**: Open with `Open "template.pzm"`

### PZC - Prism Script

- **Extension**: `.pzc`
- **Format**: Plain text script commands
- **Use case**: Automation scripts

## Official XML Schemas

GraphPad provides official XML schemas for the Prism 7.0+ format.

**Schema files included in this skill:**

- `Prism7XMLSchema.xml` - Complete schema definition
- `Prism7XMLStyleSheet.xml` - XSLT for transforming Prism XML

**Schema location**: `../../prism-xml-schema/`

**Schema covers**:

- Data table structure
- Info table structure
- Formatting elements (fonts, colors, alignment)
- Metadata (version, creation date, user info)
- Text formatting (bold, italic, underline, super/subscript)

## PZFX File Structure

### Root Structure

```xml
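```

### Data Table Elements

#### Basic Data Table

```xml
<Table ID="Data 1">
  <Title>Dose Response Data</Title>
  <ColumnTitlesRow>
    <d>X</d>
    <d>Y</d>
  </ColumnTitlesRow>
  <XColumn>
    <d>0.1</d>
    <d>1.0</d>
    <d>10.0</d>
  </XColumn>
  <YColumn>
    <Title>Control</Title>
    <Subcolumn>
      <d>10.5</d>
      <d>12.3</d>
      <d>11.8</d>
    </Subcolumn>
  </YColumn>
  <YColumn>
    <Title>Treatment</Title>
    <Subcolumn>
      <d>8.2</d>
      <d>9.1</d>
      <d>8.7</d>
    </Subcolumn>
  </YColumn>
</Table>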
```

#### Table Attributes

- `ID` - Unique identifier (e.g., "Data 1", "Table0")
- `XFormat` - X column format: "none", "numbers", "text", "date"
- `YFormat` - Y format: "replicates", "SD", "SEM", "mean"
- `Replicates` - Number of subcolumns per Y column
- `RowTitlesWidth` - Width of row titles column

### Info Table Elements

```xml
<Info>
  <Title>Experiment Information</Title>
  <Constant>
    <Name>Date</Name>
    <Value>2024-01-15</Value>
  </Constant>
  <Constant>
    <Name>Experimenter</Name>
    <Value>Jane Smith</Value>
  </Constant>
  <Constant>
    <Name>Concentration</Name>
    <Value>10.5</Value>
  </Constant>
  <Notes>
    Experiment notes go here.
    Multiple lines allowed.
  </Notes>
</Info>
```

### Results Tables (Encrypted)

```xml
```

**Note**: Results tables appear in the XML, but their values are encrypted. You can see the structure, but you cannot read the values without Prism.

### Data Types

#### Numeric Values

```xml
<d>10.5</d>
<d>1.23e-4</d>
```

#### Text Values

```xml
<d>Sample A</d>
<d>Control</d>
```

#### Formatted Text

```xml
Bold text
Italic text
Underlined
Superscript
Subscript
```

#### Dates

```xml
<Constant>
  <Name>ExperimentDate</Name>
  <Value>2024-01-15</Value>
</Constant>
```

## Parsing PZFX Files

### Python Example

```python
import xml.etree.ElementTree as ET

# Parse file
tree = ET.parse('experiment.pzfx')
root = tree.getroot()

# Find all data tables
for table in root.findall('.//Table'):
    table_id = table.get('ID')
    # findtext() returns the default instead of raising if <Title> is absent
    title = table.findtext('Title', default='')
    print(f"Table: {table_id} - {title}")

    # Get X values
    x_column = table.find('XColumn')
    if x_column is not None:
        x_values = [d.text for d in x_column.findall('d')]
        print(f"  X values: {x_values}")

    # Get Y values
    for y_column in table.findall('YColumn'):
        y_title = y_column.findtext('Title', default='')
        subcolumn = y_column.find('Subcolumn')
        y_values = [d.text for d in subcolumn.findall('d')]
        print(f"  {y_title}: {y_values}")
```

### Reading Info Constants

```python
# Find info tables (root comes from the parsing example above)
for info in root.findall('.//Info'):
    info_id = info.get('ID')
    print(f"Info table: {info_id}")

    # Get all constants
    for constant in info.findall('Constant'):
        name = constant.find('Name').text
        value = constant.find('Value').text
        print(f"  {name}: {value}")

    # Get notes
    notes = info.find('Notes')
    if notes is not None:
        print(f"  Notes: {notes.text}")
```

### Modifying Data

```python
import xml.etree.ElementTree as ET

# Load template
tree = ET.parse('template.pzfx')

# Find first data table
table = tree.find('.//Table[@ID="Data 1"]')

# Modify X values
x_column = table.find('XColumn')
x_cells = x_column.findall('d')
x_cells[0].text = "0.5"
x_cells[1].text = "5.0"

# Modify Y values
y_column = table.find('YColumn')
subcolumn = y_column.find('Subcolumn')
y_cells = subcolumn.findall('d')
y_cells[0].text = "15.2"

# Save modified file
tree.write('modified.pzfx', encoding='UTF-8', xml_declaration=True)
```

## XPath Query Examples

### Find specific table

```python
table = root.find(".//Table[@ID='Data 1']")
```

### Find all Y columns

```python
y_columns = root.findall(".//Table[@ID='Data 1']/YColumn")
```

### Find specific info constant

```python
date = root.find(".//Info/Constant[Name='Date']/Value").text
```

### Count data points

```python
num_points = len(root.findall(".//Table[@ID='Data 1']/XColumn/d"))
```

## Validation

### Using xmllint (command line)

```bash
# Validate against schema
xmllint --schema Prism7XMLSchema.xml experiment.pzfx

# Pretty print
xmllint --format experiment.pzfx

# Extract specific elements
xmllint --xpath "//Table/@ID" experiment.pzfx
```

### Python Validation

```python
from lxml import etree

# Load schema
schema_doc = etree.parse('Prism7XMLSchema.xml')
schema = etree.XMLSchema(schema_doc)

# Validate file
doc = etree.parse('experiment.pzfx')
is_valid = schema.validate(doc)

if not is_valid:
    print(schema.error_log)
```

## Common Patterns

### Extract all data to CSV

```python
import csv
import xml.etree.ElementTree as ET

tree = ET.parse('experiment.pzfx')

for table in tree.findall('.//Table'):
    table_id = table.get('ID')

    with open(f'{table_id}.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)

        # Write column titles
        titles = [d.text for d in table.find('ColumnTitlesRow').findall('d')]
        writer.writerow(titles)

        # Write data rows (one row per X value)
        x_column = table.find('XColumn')
        x_values = [d.text for d in x_column.findall('d')]

        for i, x_val in enumerate(x_values):
            row = [x_val]
            for y_column in table.findall('YColumn'):
                subcolumn = y_column.find('Subcolumn')
                y_values = [d.text for d in subcolumn.findall('d')]
                # Pad short columns with an empty string
                row.append(y_values[i] if i < len(y_values) else '')
            writer.writerow(row)
```

### Batch update info constants

```python
import glob
import xml.etree.ElementTree as ET

for pzfx_file in glob.glob('*.pzfx'):
    tree = ET.parse(pzfx_file)

    # Find or create date constant
    info = tree.find('.//Info')
    if info is None:
        # No info table in this file; nothing to update
        continue
    date_constant = info.find("./Constant[Name='Date']")

    if date_constant is None:
        # Create new constant
        constant = ET.SubElement(info, 'Constant')
        ET.SubElement(constant, 'Name').text = 'Date'
        ET.SubElement(constant, 'Value').text = '2024-01-15'
    else:
        # Update existing
        date_constant.find('Value').text = '2024-01-15'

    tree.write(pzfx_file, encoding='UTF-8', xml_declaration=True)
```

## Limitations

**What you CAN access**:

- ✅ All data values in tables
- ✅ All info constants
- ✅ Table structure and format
- ✅ Column/row titles
- ✅ File metadata

**What you CANNOT access**:

- ❌ Analysis parameter details (encrypted)
- ❌ Calculated result values (encrypted)
- ❌ Graph appearance settings (encrypted)
- ❌ Analysis method details (encrypted)

**Workaround**: Let Prism do the analysis. You modify the data, and Prism recalculates everything when the file is opened.

## Best Practices

1. **Always preserve XML declaration**: Keep the `<?xml version="1.0" encoding="UTF-8"?>` header
2. **Maintain structure**: Don't remove or reorder major elements
3. **Validate after modification**: Use xmllint or schema validation
4. **Backup originals**: Keep a copy before programmatic modification
5. **Use proper encoding**: Save as UTF-8 with XML declaration
6. **Handle missing values**: Use empty `<d/>` tags for missing data
7. **Test with Prism**: Open modified files in Prism to verify

## Resources

- **Schema files**: `../../prism-xml-schema/Prism7XMLSchema.xml`
- **Stylesheet**: `../../prism-xml-schema/Prism7XMLStyleSheet.xml`
- **GraphPad documentation**: https://www.graphpad.com/
- **XML tools**: xmllint, Python lxml, R xml2 package
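
## Appendix: Best Practices Sketch

Several of the best practices above (backing up originals, saving as UTF-8 with an XML declaration, and using empty `d` cells for missing data) can be combined into a few small helpers. The following is a minimal sketch, not part of any GraphPad tooling: the function names `read_cells`, `write_cells`, and `save_with_backup` and the `.bak` backup suffix are hypothetical choices made for illustration. It assumes numeric cells stored as direct `d` children of a `Subcolumn`, as in the examples in this document.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def read_cells(subcolumn):
    """Return <d> cell values as floats; empty cells become None.

    Assumes numeric data: non-numeric text raises ValueError.
    """
    values = []
    for cell in subcolumn.findall('d'):
        text = (cell.text or '').strip()
        values.append(float(text) if text else None)
    return values


def write_cells(subcolumn, values):
    """Replace all <d> cells; None is written as an empty <d/> element."""
    # findall() returns a list, so removing while iterating is safe
    for cell in subcolumn.findall('d'):
        subcolumn.remove(cell)
    for value in values:
        cell = ET.SubElement(subcolumn, 'd')
        if value is not None:
            cell.text = str(value)


def save_with_backup(tree, path):
    """Copy the existing file to '<name>.bak' (hypothetical suffix),
    then write the tree as UTF-8 with an XML declaration."""
    path = Path(path)
    if path.exists():
        shutil.copy2(path, path.with_name(path.name + '.bak'))
    tree.write(str(path), encoding='UTF-8', xml_declaration=True)
```

A typical call sequence would be `write_cells(subcolumn, [15.2, None, 8.7])` followed by `save_with_backup(tree, 'experiment.pzfx')`. Opening the result in Prism remains the final check that the modified file is accepted.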