# FITS File Handling (astropy.io.fits) The `astropy.io.fits` module provides comprehensive tools for reading, writing, and manipulating FITS (Flexible Image Transport System) files. ## Opening FITS Files ### Basic File Opening ```python from astropy.io import fits # Open file (returns HDUList - list of HDUs) hdul = fits.open('filename.fits') # Always close when done hdul.close() # Better: use context manager (automatically closes) with fits.open('filename.fits') as hdul: hdul.info() # Display file structure data = hdul[0].data ``` ### File Opening Modes ```python fits.open('file.fits', mode='readonly') # Read-only (default) fits.open('file.fits', mode='update') # Read and write fits.open('file.fits', mode='append') # Add HDUs to file ``` ### Memory Mapping For large files, use memory mapping (default behavior): ```python hdul = fits.open('large_file.fits', memmap=True) # Only loads data chunks as needed ``` ### Remote Files Access cloud-hosted FITS files: ```python uri = "s3://bucket-name/image.fits" with fits.open(uri, use_fsspec=True, fsspec_kwargs={"anon": True}) as hdul: # Use .section to get cutouts without downloading entire file cutout = hdul[1].section[100:200, 100:200] ``` ## HDU Structure FITS files contain Header Data Units (HDUs): - **Primary HDU** (`hdul[0]`): First HDU, always present - **Extension HDUs** (`hdul[1:]`): Image or table extensions ```python hdul.info() # Display all HDUs # Output: # No. Name Ver Type Cards Dimensions Format # 0 PRIMARY 1 PrimaryHDU 220 () # 1 SCI 1 ImageHDU 140 (1014, 1014) float32 # 2 ERR 1 ImageHDU 51 (1014, 1014) float32 ``` ## Accessing HDUs ```python # By index primary = hdul[0] extension1 = hdul[1] # By name sci = hdul['SCI'] # By name and version number sci2 = hdul['SCI', 2] # Second SCI extension ``` ## Working with Headers ### Reading Header Values ```python hdu = hdul[0] header = hdu.header # Get keyword value (case-insensitive) observer = header['OBSERVER'] exptime = header['EXPTIME'] # Get with default if missing filter_name = header.get('FILTER', 'Unknown') # Access by index value = header[7] # 8th card's value ``` ### Modifying Headers ```python # Update existing keyword header['OBSERVER'] = 'Edwin Hubble' # Add/update with comment header['OBSERVER'] = ('Edwin Hubble', 'Name of observer') # Add keyword at specific position header.insert(5, ('NEWKEY', 'value', 'comment')) # Add HISTORY and COMMENT header['HISTORY'] = 'File processed on 2025-01-15' header['COMMENT'] = 'Note about the data' # Delete keyword del header['OLDKEY'] ``` ### Header Cards Each keyword is stored as a "card" (80-character record): ```python # Access full card card = header.cards[0] print(f"{card.keyword} = {card.value} / {card.comment}") # Iterate over all cards for card in header.cards: print(f"{card.keyword}: {card.value}") ``` ## Working with Image Data ### Reading Image Data ```python # Get data from HDU data = hdul[1].data # Returns NumPy array # Data properties print(data.shape) # e.g., (1024, 1024) print(data.dtype) # e.g., float32 print(data.min(), data.max()) # Access specific pixels pixel_value = data[100, 200] region = data[100:200, 300:400] ``` ### Data Operations Data is a NumPy array, so use standard NumPy operations: ```python import numpy as np # Statistics mean = np.mean(data) median = np.median(data) std = np.std(data) # Modify data data[data < 0] = 0 # Clip negative values data = data * gain + bias # Calibration # Mathematical operations log_data = np.log10(data) smoothed = scipy.ndimage.gaussian_filter(data, sigma=2) ``` ### Cutouts and Sections Extract regions without loading entire array: ```python # Section notation [y_start:y_end, x_start:x_end] cutout = hdul[1].section[500:600, 700:800] ``` ## Creating New FITS Files ### Simple Image File ```python # Create data data = np.random.random((100, 100)) # Create HDU hdu = fits.PrimaryHDU(data=data) # Add header keywords hdu.header['OBJECT'] = 'Test Image' hdu.header['EXPTIME'] = 300.0 # Write to file hdu.writeto('new_image.fits') # Overwrite if exists hdu.writeto('new_image.fits', overwrite=True) ``` ### Multi-Extension File ```python # Create primary HDU (can have no data) primary = fits.PrimaryHDU() primary.header['TELESCOP'] = 'HST' # Create image extensions sci_data = np.ones((100, 100)) sci = fits.ImageHDU(data=sci_data, name='SCI') err_data = np.ones((100, 100)) * 0.1 err = fits.ImageHDU(data=err_data, name='ERR') # Combine into HDUList hdul = fits.HDUList([primary, sci, err]) # Write to file hdul.writeto('multi_extension.fits') ``` ## Working with Table Data ### Reading Tables ```python # Open table with fits.open('table.fits') as hdul: table = hdul[1].data # BinTableHDU or TableHDU # Access columns ra = table['RA'] dec = table['DEC'] mag = table['MAG'] # Access rows first_row = table[0] subset = table[10:20] # Column info cols = hdul[1].columns print(cols.names) cols.info() ``` ### Creating Tables ```python # Define columns col1 = fits.Column(name='ID', format='K', array=[1, 2, 3, 4]) col2 = fits.Column(name='RA', format='D', array=[10.5, 11.2, 12.3, 13.1]) col3 = fits.Column(name='DEC', format='D', array=[41.2, 42.1, 43.5, 44.2]) col4 = fits.Column(name='Name', format='20A', array=['Star1', 'Star2', 'Star3', 'Star4']) # Create table HDU table_hdu = fits.BinTableHDU.from_columns([col1, col2, col3, col4]) table_hdu.name = 'CATALOG' # Write to file table_hdu.writeto('catalog.fits', overwrite=True) ``` ### Column Formats Common FITS table column formats: - `'A'`: Character string (e.g., '20A' for 20 characters) - `'L'`: Logical (boolean) - `'B'`: Unsigned byte - `'I'`: 16-bit integer - `'J'`: 32-bit integer - `'K'`: 64-bit integer - `'E'`: 32-bit floating point - `'D'`: 64-bit floating point ## Modifying Existing Files ### Update Mode ```python with fits.open('file.fits', mode='update') as hdul: # Modify header hdul[0].header['NEWKEY'] = 'value' # Modify data hdul[1].data[100, 100] = 999 # Changes automatically saved when context exits ``` ### Append Mode ```python # Add new extension to existing file new_data = np.random.random((50, 50)) new_hdu = fits.ImageHDU(data=new_data, name='NEW_EXT') with fits.open('file.fits', mode='append') as hdul: hdul.append(new_hdu) ``` ## Convenience Functions For quick operations without managing HDU lists: ```python # Get data only data = fits.getdata('file.fits', ext=1) # Get header only header = fits.getheader('file.fits', ext=0) # Get both data, header = fits.getdata('file.fits', ext=1, header=True) # Get single keyword value exptime = fits.getval('file.fits', 'EXPTIME', ext=0) # Set keyword value fits.setval('file.fits', 'NEWKEY', value='newvalue', ext=0) # Write simple file fits.writeto('output.fits', data, header, overwrite=True) # Append to file fits.append('file.fits', data, header) # Display file info fits.info('file.fits') ``` ## Comparing FITS Files ```python # Print differences between two files fits.printdiff('file1.fits', 'file2.fits') # Compare programmatically diff = fits.FITSDiff('file1.fits', 'file2.fits') print(diff.report()) ``` ## Converting Between Formats ### FITS to/from Astropy Table ```python from astropy.table import Table # FITS to Table table = Table.read('catalog.fits') # Table to FITS table.write('output.fits', format='fits', overwrite=True) ``` ## Best Practices 1. **Always use context managers** (`with` statements) for safe file handling 2. **Avoid modifying structural keywords** (SIMPLE, BITPIX, NAXIS, etc.) 3. **Use memory mapping** for large files to conserve RAM 4. **Use .section** for remote files to avoid full downloads 5. **Check HDU structure** with `.info()` before accessing data 6. **Verify data types** before operations to avoid unexpected behavior 7. **Use convenience functions** for simple one-off operations ## Common Issues ### Handling Non-Standard FITS Some files violate FITS standards: ```python # Ignore verification warnings hdul = fits.open('bad_file.fits', ignore_missing_end=True) # Fix non-standard files hdul = fits.open('bad_file.fits') hdul.verify('fix') # Try to fix issues hdul.writeto('fixed_file.fits') ``` ### Large File Performance ```python # Use memory mapping (default) hdul = fits.open('huge_file.fits', memmap=True) # For write operations with large arrays, use Dask import dask.array as da large_array = da.random.random((10000, 10000)) fits.writeto('output.fits', large_array) ```