FITS File Handling (astropy.io.fits)
The astropy.io.fits module provides comprehensive tools for reading, writing, and manipulating FITS (Flexible Image Transport System) files.
Opening FITS Files
Basic File Opening
from astropy.io import fits
# Open file (returns HDUList - list of HDUs)
hdul = fits.open('filename.fits')
# Always close when done
hdul.close()
# Better: use context manager (automatically closes)
with fits.open('filename.fits') as hdul:
    hdul.info() # Display file structure
    data = hdul[0].data
File Opening Modes
fits.open('file.fits', mode='readonly') # Read-only (default)
fits.open('file.fits', mode='update') # Read and write
fits.open('file.fits', mode='append') # Add HDUs to file
Memory Mapping
For large files, use memory mapping (default behavior):
hdul = fits.open('large_file.fits', memmap=True)
# Only loads data chunks as needed
Remote Files
Access cloud-hosted FITS files:
uri = "s3://bucket-name/image.fits"
with fits.open(uri, use_fsspec=True, fsspec_kwargs={"anon": True}) as hdul:
    # Use .section to get cutouts without downloading entire file
    cutout = hdul[1].section[100:200, 100:200]
HDU Structure
FITS files contain Header Data Units (HDUs):
- Primary HDU (hdul[0]): First HDU, always present
- Extension HDUs (hdul[1:]): Image or table extensions
hdul.info() # Display all HDUs
# Output:
# No. Name Ver Type Cards Dimensions Format
# 0 PRIMARY 1 PrimaryHDU 220 ()
# 1 SCI 1 ImageHDU 140 (1014, 1014) float32
# 2 ERR 1 ImageHDU 51 (1014, 1014) float32
Accessing HDUs
# By index
primary = hdul[0]
extension1 = hdul[1]
# By name
sci = hdul['SCI']
# By name and version number
sci2 = hdul['SCI', 2] # Second SCI extension
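The lookups above can be tried without a file on disk. The following is a minimal sketch that builds a small HDUList in memory; the array shapes and pixel values are arbitrary placeholders chosen only to tell the two SCI extensions apart.

```python
import numpy as np
from astropy.io import fits

# Build an in-memory HDUList with two SCI extensions (EXTVER 1 and 2).
# The data values are placeholders used only to distinguish the two.
primary = fits.PrimaryHDU()
sci_v1 = fits.ImageHDU(data=np.zeros((4, 4), dtype=np.float32), name='SCI', ver=1)
sci_v2 = fits.ImageHDU(data=np.ones((4, 4), dtype=np.float32), name='SCI', ver=2)
hdul = fits.HDUList([primary, sci_v1, sci_v2])

# Look up by (name, version) tuple
second_sci = hdul['SCI', 2]

# index_of() translates the same key into a positional index
idx = hdul.index_of(('SCI', 2))

print(second_sci.name, second_sci.ver, idx)
```

Using the (name, version) form avoids depending on the order in which extensions happen to be stored.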
Working with Headers
Reading Header Values
hdu = hdul[0]
header = hdu.header
# Get keyword value (case-insensitive)
observer = header['OBSERVER']
exptime = header['EXPTIME']
# Get with default if missing
filter_name = header.get('FILTER', 'Unknown')
# Access by index
value = header[7] # 8th card's value
Modifying Headers
# Update existing keyword
header['OBSERVER'] = 'Edwin Hubble'
# Add/update with comment
header['OBSERVER'] = ('Edwin Hubble', 'Name of observer')
# Add keyword at specific position
header.insert(5, ('NEWKEY', 'value', 'comment'))
# Add HISTORY and COMMENT
header['HISTORY'] = 'File processed on 2025-01-15'
header['COMMENT'] = 'Note about the data'
# Delete keyword
del header['OLDKEY']
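Commentary keywords behave differently from ordinary keywords: assigning to HISTORY or COMMENT appends a new card rather than overwriting the previous one. A small sketch on a standalone Header object (the history strings are made-up examples):

```python
from astropy.io import fits

header = fits.Header()

# Ordinary keyword: a (value, comment) tuple sets both at once
header['OBSERVER'] = ('Edwin Hubble', 'Name of observer')

# Commentary keyword: each assignment appends another HISTORY card
header['HISTORY'] = 'Bias subtracted'
header['HISTORY'] = 'Flat fielded'

history = list(header['HISTORY'])
observer_comment = header.comments['OBSERVER']

print(history)
print(observer_comment)
```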
Header Cards
Each keyword is stored as a "card" (80-character record):
# Access full card
card = header.cards[0]
print(f"{card.keyword} = {card.value} / {card.comment}")
# Iterate over all cards
for card in header.cards:
    print(f"{card.keyword}: {card.value}")
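To make the 80-character record concrete, the sketch below builds a one-keyword header and inspects the card's raw image. The keyword and comment are illustrative values, not taken from a real file.

```python
from astropy.io import fits

header = fits.Header()
header['EXPTIME'] = (300.0, 'Exposure time in seconds')

# Cards can be looked up by keyword as well as by index
card = header.cards['EXPTIME']

# card.image is the raw text of the card, padded to the fixed record width
image = card.image
print(repr(image))
print(len(image))
```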
Working with Image Data
Reading Image Data
# Get data from HDU
data = hdul[1].data # Returns NumPy array
# Data properties
print(data.shape) # e.g., (1024, 1024)
print(data.dtype) # e.g., float32
print(data.min(), data.max())
# Access specific pixels
pixel_value = data[100, 200]
region = data[100:200, 300:400]
Data Operations
Data is a NumPy array, so use standard NumPy operations:
import numpy as np
import scipy.ndimage # Needed for gaussian_filter below
# Statistics
mean = np.mean(data)
median = np.median(data)
std = np.std(data)
# Modify data
data[data < 0] = 0 # Clip negative values
gain, bias = 2.0, 100.0 # Placeholder calibration constants
data = data * gain + bias # Calibration
# Mathematical operations
log_data = np.log10(data)
smoothed = scipy.ndimage.gaussian_filter(data, sigma=2)
Cutouts and Sections
Extract regions without loading entire array:
# Section notation [y_start:y_end, x_start:x_end]
cutout = hdul[1].section[500:600, 700:800]
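As a quick check that .section returns the same pixels as slicing the full array, here is a self-contained sketch that writes a temporary file first. The file name, extension name, and image contents are assumptions made for the demonstration.

```python
import os
import tempfile

import numpy as np
from astropy.io import fits

# Synthetic image whose pixel values encode their own position
image = np.arange(1000 * 1000, dtype=np.float32).reshape(1000, 1000)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'section_demo.fits')
    fits.HDUList([fits.PrimaryHDU(),
                  fits.ImageHDU(data=image, name='SCI')]).writeto(path)

    with fits.open(path) as hdul:
        # Rows 500-599 (y), columns 700-799 (x); only this region is read
        cutout = hdul[1].section[500:600, 700:800]

cutout_shape = cutout.shape
matches = np.array_equal(cutout, image[500:600, 700:800])
print(cutout_shape, matches)
```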
Creating New FITS Files
Simple Image File
# Create data
data = np.random.random((100, 100))
# Create HDU
hdu = fits.PrimaryHDU(data=data)
# Add header keywords
hdu.header['OBJECT'] = 'Test Image'
hdu.header['EXPTIME'] = 300.0
# Write to file
hdu.writeto('new_image.fits')
# Overwrite if exists
hdu.writeto('new_image.fits', overwrite=True)
Multi-Extension File
# Create primary HDU (can have no data)
primary = fits.PrimaryHDU()
primary.header['TELESCOP'] = 'HST'
# Create image extensions
sci_data = np.ones((100, 100))
sci = fits.ImageHDU(data=sci_data, name='SCI')
err_data = np.ones((100, 100)) * 0.1
err = fits.ImageHDU(data=err_data, name='ERR')
# Combine into HDUList
hdul = fits.HDUList([primary, sci, err])
# Write to file
hdul.writeto('multi_extension.fits')
Working with Table Data
Reading Tables
# Open table
with fits.open('table.fits') as hdul:
    table = hdul[1].data # BinTableHDU or TableHDU
    # Access columns
    ra = table['RA']
    dec = table['DEC']
    mag = table['MAG']
    # Access rows
    first_row = table[0]
    subset = table[10:20]
    # Column info
    cols = hdul[1].columns
    print(cols.names)
    cols.info()
Creating Tables
# Define columns
col1 = fits.Column(name='ID', format='K', array=[1, 2, 3, 4])
col2 = fits.Column(name='RA', format='D', array=[10.5, 11.2, 12.3, 13.1])
col3 = fits.Column(name='DEC', format='D', array=[41.2, 42.1, 43.5, 44.2])
col4 = fits.Column(name='Name', format='20A',
                   array=['Star1', 'Star2', 'Star3', 'Star4'])
# Create table HDU
table_hdu = fits.BinTableHDU.from_columns([col1, col2, col3, col4])
table_hdu.name = 'CATALOG'
# Write to file
table_hdu.writeto('catalog.fits', overwrite=True)
Column Formats
Common FITS table column formats:
- 'A': Character string (e.g., '20A' for 20 characters)
- 'L': Logical (boolean)
- 'B': Unsigned byte
- 'I': 16-bit integer
- 'J': 32-bit integer
- 'K': 64-bit integer
- 'E': 32-bit floating point
- 'D': 64-bit floating point
Modifying Existing Files
Update Mode
with fits.open('file.fits', mode='update') as hdul:
    # Modify header
    hdul[0].header['NEWKEY'] = 'value'
    # Modify data
    hdul[1].data[100, 100] = 999
    # Changes automatically saved when context exits
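The save-on-exit behavior can be confirmed with a round trip: write a file, edit it in update mode, then reopen it read-only. This sketch uses a temporary directory and a blank 200x200 image, both chosen arbitrarily for the demonstration.

```python
import os
import tempfile

import numpy as np
from astropy.io import fits

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'update_demo.fits')
    blank = np.zeros((200, 200), dtype=np.float32)
    fits.HDUList([fits.PrimaryHDU(),
                  fits.ImageHDU(data=blank, name='SCI')]).writeto(path)

    # Edit header and data in place; changes are flushed on exit
    with fits.open(path, mode='update') as hdul:
        hdul[0].header['NEWKEY'] = 'value'
        hdul[1].data[100, 100] = 999

    # Reopen read-only and copy out plain Python values
    with fits.open(path) as hdul:
        newkey = hdul[0].header['NEWKEY']
        pixel = float(hdul[1].data[100, 100])

print(newkey, pixel)
```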
Append Mode
# Add new extension to existing file
new_data = np.random.random((50, 50))
new_hdu = fits.ImageHDU(data=new_data, name='NEW_EXT')
with fits.open('file.fits', mode='append') as hdul:
    hdul.append(new_hdu)
Convenience Functions
For quick operations without managing HDU lists:
# Get data only
data = fits.getdata('file.fits', ext=1)
# Get header only
header = fits.getheader('file.fits', ext=0)
# Get both
data, header = fits.getdata('file.fits', ext=1, header=True)
# Get single keyword value
exptime = fits.getval('file.fits', 'EXPTIME', ext=0)
# Set keyword value
fits.setval('file.fits', 'NEWKEY', value='newvalue', ext=0)
# Write simple file
fits.writeto('output.fits', data, header, overwrite=True)
# Append to file
fits.append('file.fits', data, header)
# Display file info
fits.info('file.fits')
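The convenience functions can be chained into a short write, edit, read cycle. The sketch below does this on a temporary file; the file name, keyword values, and array contents are placeholders.

```python
import os
import tempfile

import numpy as np
from astropy.io import fits

data = np.arange(12, dtype=np.float32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'convenience_demo.fits')

    # Write a single-HDU file, then set a keyword without opening it manually
    fits.writeto(path, data, overwrite=True)
    fits.setval(path, 'OBSERVER', value='Edwin Hubble', ext=0)

    # Read back one keyword, then the data and header together
    observer = fits.getval(path, 'OBSERVER', ext=0)
    loaded, header = fits.getdata(path, ext=0, header=True)
    loaded = np.array(loaded)  # Copy so nothing refers back to the file

print(observer, loaded.shape, header['NAXIS'])
```

Each call opens and closes the file on its own, so for several operations on the same file an explicit fits.open() block is usually the cheaper choice.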
Comparing FITS Files
# Print differences between two files
fits.printdiff('file1.fits', 'file2.fits')
# Compare programmatically
diff = fits.FITSDiff('file1.fits', 'file2.fits')
print(diff.report())
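FITSDiff also accepts HDUList objects, and its identical attribute gives a simple yes or no answer. A minimal sketch with in-memory data (the arrays are arbitrary test values):

```python
import numpy as np
from astropy.io import fits

base = np.zeros((3, 3))
changed = base.copy()
changed[1, 1] = 5.0  # Introduce a single differing pixel

hdul_a = fits.HDUList([fits.PrimaryHDU(data=base)])
hdul_b = fits.HDUList([fits.PrimaryHDU(data=base.copy())])
hdul_c = fits.HDUList([fits.PrimaryHDU(data=changed)])

# identical is True only when no header or data differences were found
same = fits.FITSDiff(hdul_a, hdul_b).identical
different = fits.FITSDiff(hdul_a, hdul_c).identical

print(same, different)
```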
Converting Between Formats
FITS to/from Astropy Table
from astropy.table import Table
# FITS to Table
table = Table.read('catalog.fits')
# Table to FITS
table.write('output.fits', format='fits', overwrite=True)
Best Practices
- Always use context managers (with statements) for safe file handling
- Avoid modifying structural keywords (SIMPLE, BITPIX, NAXIS, etc.)
- Use memory mapping for large files to conserve RAM
- Use .section for remote files to avoid full downloads
- Check HDU structure with .info() before accessing data
- Verify data types before operations to avoid unexpected behavior
- Use convenience functions for simple one-off operations
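Checking structure before access can also be done programmatically. The helper below, first_image_hdu, is a hypothetical function written for this illustration (it is not part of astropy); it returns the first HDU that actually holds image data, which matters because the primary HDU is often empty.

```python
import numpy as np
from astropy.io import fits


def first_image_hdu(hdul):
    """Return the first image HDU with data, or None (hypothetical helper)."""
    for hdu in hdul:
        if isinstance(hdu, (fits.PrimaryHDU, fits.ImageHDU)) and hdu.data is not None:
            return hdu
    return None


# In-memory example: empty primary HDU followed by a science image
hdul = fits.HDUList([
    fits.PrimaryHDU(),
    fits.ImageHDU(data=np.ones((8, 8), dtype=np.float32), name='SCI'),
])

hdu = first_image_hdu(hdul)
print(hdu.name, hdu.data.dtype, hdu.data.shape)
```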
Common Issues
Handling Non-Standard FITS
Some files violate FITS standards:
# Tolerate a missing END card in the last header
hdul = fits.open('bad_file.fits', ignore_missing_end=True)
# Fix non-standard files
hdul = fits.open('bad_file.fits')
hdul.verify('fix') # Try to fix issues
hdul.writeto('fixed_file.fits')
Large File Performance
# Use memory mapping (default)
hdul = fits.open('huge_file.fits', memmap=True)
# For write operations with large arrays, use Dask
import dask.array as da
large_array = da.random.random((10000, 10000))
fits.writeto('output.fits', large_array)