SDSS spectroscopic catalogs
Introduction
The SDSS spectroscopic catalogs contain parameters such as redshift, classification, velocity dispersion, quality flags, and the like, measured from each spectrum. Before reading this page, please make sure you understand the basics of SDSS spectra. For details on the locations of these datasets, see the spectroscopic data access page.
Below we describe how to select "good" spectra, how to exclude duplicates, and the meaning of the most important spectroscopic parameters.
Selecting unique, good spectra
The essential information (redshifts and classifications) for each object is stored in the specObj file (the specObj table in CAS) described below. The other tables contain matching photometric information and detailed measurements of the spectra.
The files contain all spectra, including bad spectra and repeat spectra. Use the following procedures to select unique, good spectra:
- To select the best observations of all unique objects, look for objects in specObjAll with "sciencePrimary" (called "specprimary" in the SAS flat files) greater than zero. Note that this is equivalent to the specObj "view" in the database.
- If in addition you want spectra that are on survey quality plates, search for objects where "platequality" is not "bad" (that is, is either "marginal" or "good").
- You might be more interested in the quality of the individual spectrum, in which case the safest indicator is "snMedian".
- You may only be interested in survey quality objects from SDSS Legacy, in which case you should look for objects with legacyPrimary (called "speclegacy" in the SAS flat files) greater than zero.
- You may only be interested in survey quality objects from the SEGUE-1 and SEGUE-2 survey plates, in which case you should look for objects with seguePrimary (called "specsegue" in the SAS flat files) greater than zero.
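The selection rules above can be sketched in a few lines of Python. This is only an illustration on made-up rows: the dictionary keys mirror the parameter names used on this page, while the function name `select_unique_good`, the `survey` argument, and the `min_sn_median` threshold are assumptions introduced here (this page does not recommend a particular snMedian cut).

```python
# Map an optional survey restriction to the flag described above.
PRIMARY_FLAG = {
    None: "sciencePrimary",
    "legacy": "legacyPrimary",
    "segue": "seguePrimary",
}

def select_unique_good(rows, survey=None, min_sn_median=None):
    """Keep unique spectra on plates that are not 'bad' (illustrative only)."""
    flag = PRIMARY_FLAG[survey]
    selected = []
    for row in rows:
        if row[flag] <= 0:                 # not the chosen spectrum of its object
            continue
        if row["platequality"] == "bad":   # keep 'marginal' and 'good' plates
            continue
        if min_sn_median is not None and row["snMedian"] < min_sn_median:
            continue                       # optional per-spectrum quality cut
        selected.append(row)
    return selected

# Hypothetical rows standing in for specObjAll entries.
spectra = [
    {"id": 1, "sciencePrimary": 1, "legacyPrimary": 1, "seguePrimary": 0,
     "platequality": "good", "snMedian": 12.0},
    {"id": 2, "sciencePrimary": 0, "legacyPrimary": 0, "seguePrimary": 0,
     "platequality": "good", "snMedian": 15.0},   # repeat spectrum
    {"id": 3, "sciencePrimary": 1, "legacyPrimary": 0, "seguePrimary": 1,
     "platequality": "marginal", "snMedian": 3.0},
    {"id": 4, "sciencePrimary": 1, "legacyPrimary": 1, "seguePrimary": 0,
     "platequality": "bad", "snMedian": 8.0},
]

unique_good_ids = [r["id"] for r in select_unique_good(spectra)]
```

The same cuts translate directly into a WHERE clause when querying the database rather than the flat files.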
"sciencePrimary" is basically designed to choose the best available unique set of spectra, in the sense that between two (or more) spectra of the same location on the sky sciencePrimary will only be set for one. Any group of spectra which are within 2 arcsec of each other are considered to be of the same object, and only one of them is called "sciencePrimary". The object picked as primary is the one that satisfies best the following conditions, in decreasing order of importance:
- whether it is on a "primary" observation of a given plate (see below for the technical definition);
- whether the spectrum has a positive signal-to-noise;
- whether the plate it is on is classified as "good";
- whether the redshift determination spawns no warnings in ZWARNING; and,
- in the case that more than one spectrum satisfies all the above conditions equally well, or equally badly, the one with the largest signal-to-noise is selected.
Note that a fiber can be "sciencePrimary" even if it is not in a "primary" MJD for its plate; it just has to be the "best" observation of that location according to the above conditions. For example, if a group of spectra of the same location on the sky are ALL on "bad" plates (so that none of them is on a "primary" observation, failing condition 1), one of them will still be chosen as "sciencePrimary" based on the subsequent criteria. Failures of the subsequent criteria are treated similarly.
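One way to picture this ranking is as a lexicographic sort key. The sketch below assumes the spectra have already been grouped by position (within 2 arcsec) and that "decreasing order of importance" means a strict lexicographic comparison; that reading, the function name `pick_science_primary`, and the keys `onPrimaryObs`, `sn`, `platequality` and `zwarning` are illustrative assumptions, not the pipeline's actual code.

```python
def pick_science_primary(group):
    """Return the spectrum of a positional group that ranks best (sketch)."""
    def rank(spec):
        # Tuples compare element by element, so earlier entries dominate.
        return (
            bool(spec["onPrimaryObs"]),        # on a "primary" plate observation
            spec["sn"] > 0,                    # positive signal-to-noise
            spec["platequality"] == "good",    # plate classified as "good"
            spec["zwarning"] == 0,             # no redshift warnings
            spec["sn"],                        # tie-break: largest signal-to-noise
        )
    return max(group, key=rank)

# Hypothetical group in which every spectrum sits on a "bad" plate:
# one is still chosen, here on the basis of ZWARNING.
all_bad = [
    {"id": "a", "onPrimaryObs": 0, "sn": 9.0, "platequality": "bad", "zwarning": 4},
    {"id": "b", "onPrimaryObs": 0, "sn": 5.0, "platequality": "bad", "zwarning": 0},
]
chosen_id = pick_science_primary(all_bad)["id"]
```

In this example spectrum "b" is selected even though "a" has the higher signal-to-noise, because a clean ZWARNING outranks the signal-to-noise tie-break.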
There are similarly defined quantities seguePrimary and legacyPrimary, which apply the above criteria but restrict to SEGUE plates and Legacy plates, respectively (so that a SEGUE observation of an object is never legacyPrimary). These can be useful for understanding window functions and creating homogeneous data sets (though they differ only a little from sciencePrimary).
Plate quality and primary plates
The "plates" file (and the plateX table) contains all observations of all plates, including repeats and including some low signal-to-noise plates. You can restrict that list of plates in the following way:
- To select primary, survey quality plates, look for plates with isPrimary greater than zero (IS_PRIMARY in the flat files).
- To select the best observation of each plate, including plates with no survey quality observations, look for plates with isBest greater than zero.
- To select primary, survey quality tiles from SDSS Legacy, look for plates with isTile greater than zero.
- To select primary, survey quality tiles from the SEGUE surveys, look for plates with isSegue greater than zero.
- To simply check for plate quality, regardless of whether it is a repeat plate, "platequality" classifies plates into "bad", "marginal" and "good".
The PLATEQUALITY string is set for each observation (labeled by its MJD) of each plate. For DR8 plates the definition varies depending on whether the plate is an SDSS plate (that is, has survey set to 'sdss'), a SEGUE-1 plate (that is, has survey set to 'segue1'), or a SEGUE-2 plate (that is, has survey set to 'segue2').
For SDSS plates, the conditions are based on the signal-to-noise and the fraction of bad pixels:
PLATESN2>15 AND FBADPIX<0.05 -> 'good'
PLATESN2>9 AND FBADPIX<0.13 -> 'marginal' (if not 'good')
otherwise -> 'bad'
For SEGUE-1 plates, the conditions are based on the signal-to-noise of the main sequence turnoff at g=18, except for some special plates:
for faint plates: SN of turnoff @ g=18 > 16 for 'good'
for bright plates: SN of turnoff @ g=18 > 7.5 for 'good'
for low-latitude or test plates: consult $SAS_DIR/data/segue1-hand.par
For SEGUE-2 plates, the conditions are also based on the signal-to-noise of the main sequence turnoff at g=18:
median(SN for MS-turnoff @ g=18) > 10 -> 'good'
otherwise -> 'bad'
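As a rough illustration, the SDSS and SEGUE-2 rules above can be written as two small Python functions. The function names are hypothetical, and the SEGUE-2 function simply assumes that the caller already has a list of signal-to-noise values for main sequence turnoff stars at g=18; how those values are measured is not specified here. SEGUE-1 is left out because only its 'good' thresholds are given above and its special plates are handled by hand.

```python
from statistics import median

def sdss_plate_quality(platesn2, fbadpix):
    """PLATEQUALITY for an SDSS plate observation, following the thresholds above."""
    if platesn2 > 15 and fbadpix < 0.05:
        return "good"
    if platesn2 > 9 and fbadpix < 0.13:
        return "marginal"
    return "bad"

def segue2_plate_quality(turnoff_sn_values):
    """PLATEQUALITY for a SEGUE-2 plate from turnoff-star S/N values (assumed input)."""
    return "good" if median(turnoff_sn_values) > 10 else "bad"

# A plate with high signal-to-noise but many bad pixels is only 'marginal'.
example_quality = sdss_plate_quality(platesn2=20.0, fbadpix=0.10)
```

Note that both conditions must hold for each level, so a high PLATESN2 cannot compensate for a large bad-pixel fraction.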
The isPrimary flag is set for each observation (labeled by its MJD) of each plate. It is "1" if we consider that MJD to be the best observation of that plate, and for it to be an acceptable observation from a science point of view (with PLATEQUALITY either 'marginal' or 'good'). It is "0" either if there is a better observation or if all observations are labeled 'bad'.
The isBest flag is set for each observation (labeled by its MJD) of each plate. It is "1" if we consider that MJD to be the best observation of that plate, "0" otherwise.
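The relation between the two flags can be sketched as follows. This reads the definitions above as "isPrimary is set when an observation is the best one for its plate and is of survey quality"; it takes isBest as given, since the rule for choosing the best observation is not spelled out here. The function name `is_primary` and the example rows are hypothetical.

```python
def is_primary(is_best, platequality):
    """isPrimary as read from the text: best observation AND survey quality."""
    return 1 if is_best and platequality in ("marginal", "good") else 0

# Hypothetical observations: plate 100 was observed twice, plate 200 once.
observations = [
    {"plate": 100, "mjd": 51000, "isBest": 0, "platequality": "marginal"},
    {"plate": 100, "mjd": 51020, "isBest": 1, "platequality": "good"},
    {"plate": 200, "mjd": 52000, "isBest": 1, "platequality": "bad"},
]
for obs in observations:
    obs["isPrimary"] = is_primary(obs["isBest"], obs["platequality"])

primary_flags = [obs["isPrimary"] for obs in observations]
```

Plate 200 illustrates the difference: its only observation is the best available, so isBest is set, but it is 'bad', so the plate has no primary observation.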
Object information
Each object has a classification (CLASS) and a redshift determination (Z) with an associated error (Z_ERR). For galaxies, a velocity dispersion can be determined (down to about 70 km/s). The redshifts are determined by fitting models to each spectrum assuming a large range of possible redshifts. The best model is chosen on the basis of the chi-squared value of the data with respect to the model.
In addition, there is a bitmask called ZWARNING which has flags set in suspicious cases. A ZWARNING equal to zero indicates that no problems were identified. Most bits in that mask are signs of substantial problems; the exception is the MANY_OUTLIERS bit, which can be set for successful spectra that either have a very high signal-to-noise ratio (e.g. bright stars) or are unusual (e.g. some broad-line AGN in galaxies).
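A check that tolerates MANY_OUTLIERS while rejecting every other warning might look like the sketch below. The bit position used for MANY_OUTLIERS (bit 4, value 16) is not stated on this page and is an assumption that should be verified against the ZWARNING bitmask documentation; the function name `redshift_is_usable` is also hypothetical.

```python
# Assumed bit position for MANY_OUTLIERS; verify against the bitmask documentation.
MANY_OUTLIERS = 1 << 4

def redshift_is_usable(zwarning, allow_many_outliers=True):
    """True if no ZWARNING bits are set, optionally ignoring MANY_OUTLIERS."""
    if allow_many_outliers:
        zwarning &= ~MANY_OUTLIERS   # clear the tolerated bit before testing
    return zwarning == 0

clean = redshift_is_usable(0)
bright_star = redshift_is_usable(MANY_OUTLIERS)
```

The strict selection is recovered by passing `allow_many_outliers=False`, which is the same as requiring ZWARNING equal to zero.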
The classifications are stored in the CLASS and SUBCLASS parameters. They can take the following values:
- GALAXY: identified with a galaxy template; can have subclasses:
  - STARFORMING: set based on whether the galaxy has detectable emission lines that are consistent with star-formation according to the criteria:
    log10(OIII/Hβ) < 0.7 - 1.2(log10(NII/Hα) + 0.4)
  - STARBURST: set if the galaxy is star-forming but has an equivalent width of Hα greater than 50 Å
  - AGN: set based on whether the galaxy has detectable emission lines that are consistent with being a Seyfert or LINER:
    log10(OIII/Hβ) > 0.7 - 1.2(log10(NII/Hα) + 0.4)
- QSO: identified with a QSO template
- STAR: identified with a stellar template, chosen among the following subclasses: O, OB, B6, B9, A0, A0p, F2, F5, F9, G0, G2, G5, K1, K3, K5, K7, M0V, M2V, M1, M2, M3, M4, M5, M6, M7, M8, L0, L1, L2, L3, L4, L5, L5.5, L9, T2, Carbon, Carbon_lines, CarbonWD, CV
If a galaxy or quasar has lines detected at the 10-sigma level with widths (sigmas) greater than 200 km/sec at the 5-sigma level, the indication "BROADLINE" is appended to its subclass.
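The emission-line criteria for the galaxy subclasses can be illustrated with a short Python function. It assumes that the lines have already been judged detectable, takes the two flux ratios and the Hα equivalent width (in Å) as plain numbers, and returns no subclass when a galaxy sits exactly on the dividing line. The function name and argument names are hypothetical, and the BROADLINE suffix is not handled.

```python
import math

def galaxy_subclass(oiii_hbeta, nii_halpha, halpha_ew):
    """Subclass from the line-ratio criteria above (sketch; lines assumed detected)."""
    log_oiii_hbeta = math.log10(oiii_hbeta)
    boundary = 0.7 - 1.2 * (math.log10(nii_halpha) + 0.4)
    if log_oiii_hbeta < boundary:
        # Star-forming side; a large H-alpha equivalent width marks a starburst.
        return "STARBURST" if halpha_ew > 50 else "STARFORMING"
    if log_oiii_hbeta > boundary:
        return "AGN"
    return None  # exactly on the boundary: neither inequality holds

# With NII/Halpha = 0.1 the boundary is 0.7 - 1.2 * (-1 + 0.4) = 1.42,
# so OIII/Hbeta = 1 (log10 = 0) falls on the star-forming side.
example_subclass = galaxy_subclass(oiii_hbeta=1.0, nii_halpha=0.1, halpha_ew=10.0)
```

Working in log space keeps the comparison identical in form to the inequalities quoted above.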
As examples, the full resolution version of the figure showing the spectra on the spectrum page lists the CLASSes, SUBCLASSes and error flags of these particular spectra.
For each object, there are also detailed quality determinations, a full description of the templates used, and the targeting information. See the CAS table schema or the specObj file datamodel page for full information.