# Microscopy and Imaging File Formats Reference This reference covers file formats used in microscopy, medical imaging, remote sensing, and scientific image analysis. ## Microscopy-Specific Formats ### .tif / .tiff - Tagged Image File Format **Description:** Flexible image format supporting multiple pages and metadata **Typical Data:** Microscopy images, z-stacks, time series, multi-channel **Use Cases:** Fluorescence microscopy, confocal imaging, biological imaging **Python Libraries:** - `tifffile`: `tifffile.imread('file.tif')` - Microscopy TIFF support - `PIL/Pillow`: `Image.open('file.tif')` - Basic TIFF - `scikit-image`: `io.imread('file.tif')` - `AICSImageIO`: Multi-format microscopy reader **EDA Approach:** - Image dimensions and bit depth - Multi-page/z-stack analysis - Metadata extraction (OME-TIFF) - Channel analysis and intensity distributions - Temporal dynamics (time-lapse) - Pixel size and spatial calibration - Histogram analysis per channel - Dynamic range utilization ### .nd2 - Nikon NIS-Elements **Description:** Proprietary Nikon microscope format **Typical Data:** Multi-dimensional microscopy (XYZCT) **Use Cases:** Nikon microscope data, confocal, widefield **Python Libraries:** - `nd2reader`: `ND2Reader('file.nd2')` - `pims`: `pims.ND2_Reader('file.nd2')` - `AICSImageIO`: Universal reader **EDA Approach:** - Experiment metadata extraction - Channel configurations - Time-lapse frame analysis - Z-stack depth and spacing - XY stage positions - Laser settings and power - Pixel binning information - Acquisition timestamps ### .lif - Leica Image Format **Description:** Leica microscope proprietary format **Typical Data:** Multi-experiment, multi-dimensional images **Use Cases:** Leica confocal and widefield data **Python Libraries:** - `readlif`: `readlif.LifFile('file.lif')` - `AICSImageIO`: LIF support - `python-bioformats`: Via Bio-Formats **EDA Approach:** - Multiple experiment detection - Image series enumeration - Metadata per experiment - Channel and timepoint structure - Physical dimensions extraction - Objective and detector information - Scan settings analysis ### .czi - Carl Zeiss Image **Description:** Zeiss microscope format **Typical Data:** Multi-dimensional microscopy with rich metadata **Use Cases:** Zeiss confocal, lightsheet, widefield **Python Libraries:** - `czifile`: `czifile.CziFile('file.czi')` - `AICSImageIO`: CZI support - `pylibCZIrw`: Official Zeiss library **EDA Approach:** - Scene and position analysis - Mosaic tile structure - Channel wavelength information - Acquisition mode detection - Scaling and calibration - Instrument configuration - ROI definitions ### .oib / .oif - Olympus Image Format **Description:** Olympus microscope formats **Typical Data:** Confocal and multiphoton imaging **Use Cases:** Olympus FluoView data **Python Libraries:** - `AICSImageIO`: OIB/OIF support - `python-bioformats`: Via Bio-Formats **EDA Approach:** - Directory structure validation (OIF) - Metadata file parsing - Channel configuration - Scan parameters - Objective and filter information - PMT settings ### .vsi - Olympus VSI **Description:** Olympus slide scanner format **Typical Data:** Whole slide imaging, large mosaics **Use Cases:** Virtual microscopy, pathology **Python Libraries:** - `openslide-python`: `openslide.OpenSlide('file.vsi')` - `AICSImageIO`: VSI support **EDA Approach:** - Pyramid level analysis - Tile structure and overlap - Macro and label images - Magnification levels - Whole slide statistics - Region detection ### .ims - Imaris Format **Description:** Bitplane Imaris HDF5-based format **Typical Data:** Large 3D/4D microscopy datasets **Use Cases:** 3D rendering, time-lapse analysis **Python Libraries:** - `h5py`: Direct HDF5 access - `imaris_ims_file_reader`: Specialized reader **EDA Approach:** - Resolution level analysis - Time point structure - Channel organization - Dataset hierarchy - Thumbnail generation - Memory-mapped access strategies - Chunking optimization ### .lsm - Zeiss LSM **Description:** Legacy Zeiss confocal format **Typical Data:** Confocal laser scanning microscopy **Use Cases:** Older Zeiss confocal data **Python Libraries:** - `tifffile`: LSM support (TIFF-based) - `python-bioformats`: LSM reading **EDA Approach:** - Similar to TIFF with LSM-specific metadata - Scan speed and resolution - Laser lines and power - Detector gain and offset - LUT information ### .stk - MetaMorph Stack **Description:** MetaMorph image stack format **Typical Data:** Time-lapse or z-stack sequences **Use Cases:** MetaMorph software output **Python Libraries:** - `tifffile`: STK is TIFF-based - `python-bioformats`: STK support **EDA Approach:** - Stack dimensionality - Plane metadata - Timing information - Stage positions - UIC tags parsing ### .dv - DeltaVision **Description:** Applied Precision DeltaVision format **Typical Data:** Deconvolution microscopy **Use Cases:** DeltaVision microscope data **Python Libraries:** - `mrc`: Can read DV (MRC-related) - `AICSImageIO`: DV support **EDA Approach:** - Wave information (channels) - Extended header analysis - Lens and magnification - Deconvolution status - Time stamps per section ### .mrc - Medical Research Council **Description:** Electron microscopy format **Typical Data:** EM images, cryo-EM, tomography **Use Cases:** Structural biology, electron microscopy **Python Libraries:** - `mrcfile`: `mrcfile.open('file.mrc')` - `EMAN2`: EM-specific tools **EDA Approach:** - Volume dimensions - Voxel size and units - Origin and map statistics - Symmetry information - Extended header analysis - Density statistics - Header consistency validation ### .dm3 / .dm4 - Gatan Digital Micrograph **Description:** Gatan TEM/STEM format **Typical Data:** Transmission electron microscopy **Use Cases:** TEM imaging and analysis **Python Libraries:** - `hyperspy`: `hs.load('file.dm3')` - `ncempy`: `ncempy.io.dm.dmReader('file.dm3')` **EDA Approach:** - Microscope parameters - Energy dispersive spectroscopy data - Diffraction patterns - Calibration information - Tag structure analysis - Image series handling ### .eer - Electron Event Representation **Description:** Direct electron detector format **Typical Data:** Electron counting data from detectors **Use Cases:** Cryo-EM data collection **Python Libraries:** - `mrcfile`: Some EER support - Vendor-specific tools (Gatan, TFS) **EDA Approach:** - Event counting statistics - Frame rate and dose - Detector configuration - Motion correction assessment - Gain reference validation ### .ser - TIA Series **Description:** FEI/TFS TIA format **Typical Data:** EM image series **Use Cases:** FEI/Thermo Fisher EM data **Python Libraries:** - `hyperspy`: SER support - `ncempy`: TIA reader **EDA Approach:** - Series structure - Calibration data - Acquisition metadata - Time stamps - Multi-dimensional data organization ## Medical and Biological Imaging ### .dcm - DICOM **Description:** Digital Imaging and Communications in Medicine **Typical Data:** Medical images with patient/study metadata **Use Cases:** Clinical imaging, radiology, CT, MRI, PET **Python Libraries:** - `pydicom`: `pydicom.dcmread('file.dcm')` - `SimpleITK`: `sitk.ReadImage('file.dcm')` - `nibabel`: Limited DICOM support **EDA Approach:** - Patient metadata extraction (anonymization check) - Modality-specific analysis - Series and study organization - Slice thickness and spacing - Window/level settings - Hounsfield units (CT) - Image orientation and position - Multi-frame analysis ### .nii / .nii.gz - NIfTI **Description:** Neuroimaging Informatics Technology Initiative **Typical Data:** Brain imaging, fMRI, structural MRI **Use Cases:** Neuroimaging research, brain analysis **Python Libraries:** - `nibabel`: `nibabel.load('file.nii')` - `nilearn`: Neuroimaging with ML - `SimpleITK`: NIfTI support **EDA Approach:** - Volume dimensions and voxel size - Affine transformation matrix - Time series analysis (fMRI) - Intensity distribution - Brain extraction quality - Registration assessment - Orientation validation - Header information consistency ### .mnc - MINC Format **Description:** Medical Image NetCDF **Typical Data:** Medical imaging (predecessor to NIfTI) **Use Cases:** Legacy neuroimaging data **Python Libraries:** - `pyminc`: MINC-specific tools - `nibabel`: MINC support **EDA Approach:** - Similar to NIfTI - NetCDF structure exploration - Dimension ordering - Metadata extraction ### .nrrd - Nearly Raw Raster Data **Description:** Medical imaging format with detached header **Typical Data:** Medical images, research imaging **Use Cases:** 3D Slicer, ITK-based applications **Python Libraries:** - `pynrrd`: `nrrd.read('file.nrrd')` - `SimpleITK`: NRRD support **EDA Approach:** - Header field analysis - Encoding format - Dimension and spacing - Orientation matrix - Compression assessment - Endianness handling ### .mha / .mhd - MetaImage **Description:** MetaImage format (ITK) **Typical Data:** Medical/scientific 3D images **Use Cases:** ITK/SimpleITK applications **Python Libraries:** - `SimpleITK`: Native MHA/MHD support - `itk`: Direct ITK integration **EDA Approach:** - Header-data file pairing (MHD) - Transform matrix - Element spacing - Compression format - Data type and dimensions ### .hdr / .img - Analyze Format **Description:** Legacy medical imaging format **Typical Data:** Brain imaging (pre-NIfTI) **Use Cases:** Old neuroimaging datasets **Python Libraries:** - `nibabel`: Analyze support - Conversion to NIfTI recommended **EDA Approach:** - Header-image pairing validation - Byte order issues - Conversion to modern formats - Metadata limitations ## Scientific Image Formats ### .png - Portable Network Graphics **Description:** Lossless compressed image format **Typical Data:** 2D images, screenshots, processed data **Use Cases:** Publication figures, lossless storage **Python Libraries:** - `PIL/Pillow`: `Image.open('file.png')` - `scikit-image`: `io.imread('file.png')` - `imageio`: `imageio.imread('file.png')` **EDA Approach:** - Bit depth analysis (8-bit, 16-bit) - Color mode (grayscale, RGB, palette) - Metadata (PNG chunks) - Transparency handling - Compression efficiency - Histogram analysis ### .jpg / .jpeg - Joint Photographic Experts Group **Description:** Lossy compressed image format **Typical Data:** Natural images, photos **Use Cases:** Visualization, web graphics (not raw data) **Python Libraries:** - `PIL/Pillow`: Standard JPEG support - `scikit-image`: JPEG reading **EDA Approach:** - Compression artifacts detection - Quality factor estimation - Color space (RGB, grayscale) - EXIF metadata - Quantization table analysis - Note: Not suitable for quantitative analysis ### .bmp - Bitmap Image **Description:** Uncompressed raster image **Typical Data:** Simple images, screenshots **Use Cases:** Compatibility, simple storage **Python Libraries:** - `PIL/Pillow`: BMP support - `scikit-image`: BMP reading **EDA Approach:** - Color depth - Palette analysis (if indexed) - File size efficiency - Pixel format validation ### .gif - Graphics Interchange Format **Description:** Image format with animation support **Typical Data:** Animated images, simple graphics **Use Cases:** Animations, time-lapse visualization **Python Libraries:** - `PIL/Pillow`: GIF support - `imageio`: Better GIF animation support **EDA Approach:** - Frame count and timing - Palette limitations (256 colors) - Loop count - Disposal method - Transparency handling ### .svg - Scalable Vector Graphics **Description:** XML-based vector graphics **Typical Data:** Vector drawings, plots, diagrams **Use Cases:** Publication-quality figures, plots **Python Libraries:** - `svgpathtools`: Path manipulation - `cairosvg`: Rasterization - `lxml`: XML parsing **EDA Approach:** - Element structure analysis - Style information - Viewbox and dimensions - Path complexity - Text element extraction - Layer organization ### .eps - Encapsulated PostScript **Description:** Vector graphics format **Typical Data:** Publication figures **Use Cases:** Legacy publication graphics **Python Libraries:** - `PIL/Pillow`: Basic EPS rasterization - `ghostscript` via subprocess **EDA Approach:** - Bounding box information - Preview image validation - Font embedding - Conversion to modern formats ### .pdf (Images) **Description:** Portable Document Format with images **Typical Data:** Publication figures, multi-page documents **Use Cases:** Publication, data presentation **Python Libraries:** - `PyMuPDF/fitz`: `fitz.open('file.pdf')` - `pdf2image`: Rasterization - `pdfplumber`: Text and layout extraction **EDA Approach:** - Page count - Image extraction - Resolution and DPI - Embedded fonts and metadata - Compression methods - Image vs vector content ### .fig - MATLAB Figure **Description:** MATLAB figure file **Typical Data:** MATLAB plots and figures **Use Cases:** MATLAB data visualization **Python Libraries:** - Custom parsers (MAT file structure) - Conversion to other formats **EDA Approach:** - Figure structure - Data extraction from plots - Axes and label information - Plot type identification ### .hdf5 (Imaging Specific) **Description:** HDF5 for large imaging datasets **Typical Data:** High-content screening, large microscopy **Use Cases:** BigDataViewer, large-scale imaging **Python Libraries:** - `h5py`: Universal HDF5 access - Imaging-specific readers (BigDataViewer) **EDA Approach:** - Dataset hierarchy - Chunk and compression strategy - Multi-resolution pyramid - Metadata organization - Memory-mapped access - Parallel I/O performance ### .zarr - Chunked Array Storage **Description:** Cloud-optimized array storage **Typical Data:** Large imaging datasets, OME-ZARR **Use Cases:** Cloud microscopy, large-scale analysis **Python Libraries:** - `zarr`: `zarr.open('file.zarr')` - `ome-zarr-py`: OME-ZARR support **EDA Approach:** - Chunk size optimization - Compression codec analysis - Multi-scale representation - Array dimensions and dtype - Metadata structure (OME) - Cloud access patterns ### .raw - Raw Image Data **Description:** Unformatted binary pixel data **Typical Data:** Raw detector output **Use Cases:** Custom imaging systems **Python Libraries:** - `numpy`: `np.fromfile()` with dtype - `imageio`: Raw format plugins **EDA Approach:** - Dimensions determination (external info needed) - Byte order and data type - Header presence detection - Pixel value range - Noise characteristics ### .bin - Binary Image Data **Description:** Generic binary image format **Typical Data:** Raw or custom-formatted images **Use Cases:** Instrument-specific outputs **Python Libraries:** - `numpy`: Custom binary reading - `struct`: For structured binary data **EDA Approach:** - Format specification required - Header parsing (if present) - Data type inference - Dimension extraction - Validation with known parameters ## Image Analysis Formats ### .roi - ImageJ ROI **Description:** ImageJ region of interest format **Typical Data:** Geometric ROIs, selections **Use Cases:** ImageJ/Fiji analysis workflows **Python Libraries:** - `read-roi`: `read_roi.read_roi_file('file.roi')` - `roifile`: ROI manipulation **EDA Approach:** - ROI type analysis (rectangle, polygon, etc.) - Coordinate extraction - ROI properties (area, perimeter) - Group analysis (ROI sets) - Z-position and time information ### .zip (ROI sets) **Description:** ZIP archive of ImageJ ROIs **Typical Data:** Multiple ROI files **Use Cases:** Batch ROI analysis **Python Libraries:** - `read-roi`: `read_roi.read_roi_zip('file.zip')` - Standard `zipfile` module **EDA Approach:** - ROI count in set - ROI type distribution - Spatial distribution - Overlapping ROI detection - Naming conventions ### .ome.tif / .ome.tiff - OME-TIFF **Description:** TIFF with OME-XML metadata **Typical Data:** Standardized microscopy with rich metadata **Use Cases:** Bio-Formats compatible storage **Python Libraries:** - `tifffile`: OME-TIFF support - `AICSImageIO`: OME reading - `python-bioformats`: Bio-Formats integration **EDA Approach:** - OME-XML validation - Physical dimensions extraction - Channel naming and wavelengths - Plane positions (Z, C, T) - Instrument metadata - Bio-Formats compatibility ### .ome.zarr - OME-ZARR **Description:** OME-NGFF specification on ZARR **Typical Data:** Next-generation file format for bioimaging **Use Cases:** Cloud-native imaging, large datasets **Python Libraries:** - `ome-zarr-py`: Official implementation - `zarr`: Underlying array storage **EDA Approach:** - Multiscale resolution levels - Metadata compliance with OME-NGFF spec - Coordinate transformations - Label and ROI handling - Cloud storage optimization - Chunk access patterns ### .klb - Keller Lab Block **Description:** Fast microscopy format for large data **Typical Data:** Lightsheet microscopy, time-lapse **Use Cases:** High-throughput imaging **Python Libraries:** - `pyklb`: KLB reading and writing **EDA Approach:** - Compression efficiency - Block structure - Multi-resolution support - Read performance benchmarking - Metadata extraction ### .vsi - Whole Slide Imaging **Description:** Virtual slide format (multiple vendors) **Typical Data:** Pathology slides, large mosaics **Use Cases:** Digital pathology **Python Libraries:** - `openslide-python`: Multi-format WSI - `tiffslide`: Pure Python alternative **EDA Approach:** - Pyramid level count - Downsampling factors - Associated images (macro, label) - Tile size and overlap - MPP (microns per pixel) - Background detection - Tissue segmentation ### .ndpi - Hamamatsu NanoZoomer **Description:** Hamamatsu slide scanner format **Typical Data:** Whole slide pathology images **Use Cases:** Digital pathology workflows **Python Libraries:** - `openslide-python`: NDPI support **EDA Approach:** - Multi-resolution pyramid - Lens and objective information - Scan area and magnification - Focal plane information - Tissue detection ### .svs - Aperio ScanScope **Description:** Aperio whole slide format **Typical Data:** Digital pathology slides **Use Cases:** Pathology image analysis **Python Libraries:** - `openslide-python`: SVS support **EDA Approach:** - Pyramid structure - MPP calibration - Label and macro images - Compression quality - Thumbnail generation ### .scn - Leica SCN **Description:** Leica slide scanner format **Typical Data:** Whole slide imaging **Use Cases:** Digital pathology **Python Libraries:** - `openslide-python`: SCN support **EDA Approach:** - Tile structure analysis - Collection organization - Metadata extraction - Magnification levels