
Authentication of Human and Mouse Cell Lines by Short Tandem Repeat (STR) DNA Genotype Analysis

Abstract

The earliest mammalian cell lines, established by Earle et al. in 1943, were derived from the subcutaneous tissue of a C3H strain mouse, and many cell lines still used today were derived from the L strain in those early studies [31]. Several years later, HeLa, the first human cell line, was established. Since then, the use of human and mouse cell lines has increased steadily: as models for cell biology, genetics, and diseases such as cancer; for the production of viruses and vaccine development; and as tools for synthesizing recombinant proteins for use as therapeutics. Unfortunately, this accelerated use of cell lines, combined with a lack of best practices in tissue culture, has led to increases in cellular cross-contamination and, in turn, to spurious results. Short Tandem Repeat (STR) DNA genotyping, a simple molecular technique, is now available for both human and mouse cell lines. If applied routinely in cell culture management, this technique, in combination with others, can greatly improve the detection of cellular cross-contamination, resulting in more reproducible and meaningful research.

This Assay Guidance Manual (AGM) chapter is an extensive update of the 2013 version entitled Authentication of Human Cell Lines by STR DNA Profiling Analysis by Yvonne Reid, Douglas Storts, Terry Riss, and Lisa Minor [68]. This new document is targeted primarily at researchers who will need to interpret STR genotyping data generated in their laboratory or received from a core facility or commercial testing laboratory for the authentication of mouse and human cell lines. These testing facilities should consult the ANSI-ATCC ASN-0002 revised in 2021 for more detailed guidelines for evaluation and interpretation of STR data [49].

Introduction

Historical Background of Human STR Genotyping

Ever since 1951 there has been a progressive increase in the use of cultured cells as models, substrates, and tools for both basic research and industrial applications. The rapid growth in areas such as cell biology, genomics, and proteomics has triggered a remarkable increase in cell culture activities, resulting in an increase in the potential risk of cross-contamination of cell cultures, which results in the original cells in the culture being replaced by the contaminating cells.

Interspecies and intraspecies cross-contamination among cultured cell lines is a persistent problem and has been reported at frequencies ranging from 6 to 100% [6, 47, 57, 60]. Detection is particularly difficult if co-cultivated cells express similar phenotypes. At one point, cell lines contaminated with HeLa, a cell line derived from an invasive cervical adenocarcinoma in 1951 [39, 44, 53], represented one-third of all human tumor cell lines developed for research in cancer and cell biology. Stanley Gartler showed in 1967-1968 that 18 extensively used cell lines were all derived from HeLa cells [36, 37]. Currently, at least 209 cell lines in the Cellosaurus database are misidentified and have been shown to be HeLa (A. Bairoch, personal communication 2021) [9, 10, 61, 62]. In 1999, Drexler et al. found that 15% of 117 hematopoietic cell lines received from original investigators or a secondary source were cross-contaminated with other cell lines [27]. From the same cell line repository, MacLeod et al. reported that 18% of a collection of 252 cell lines were misidentified [54]. More recent reports have shown many cross-contaminated cell lines purportedly representing, for example, breast cancer [46], prostate cancer [71, 78-81], thyroid cancer [72], adenoid cystic carcinoma [66], and esophageal cancer [15]. Reviewing published reports of the identities of 3,630 human cell lines, Korch and Varella-Garcia reported that an average of 22.5%, or about 2 of 9 cell lines, were misidentified [47]. Cross-contamination of cell lines has persisted as a result of mishandling and a lack of attention to best practices in tissue culture [11, 42, 47, 82]. As Lewis L. Coriell foresaw in 1962, "Like death and taxes, contamination or the threat of contamination is always with us" [24].

The discovery of DNA hypervariable regions within genomes has made it possible to identify each human cell line derived from a single donor. In 1985 Alec Jeffreys and others [43, 59] demonstrated that hypervariable regions, which consist of variable number tandem repeat (VNTR) units from minisatellite DNA, are distributed throughout the genome and can be used to produce a DNA "fingerprint". These DNA fingerprints are very complex and not easily interpretable. However, subsequent advances in the technology have given rise to the use of Short Tandem Repeats (STRs) in microsatellite regions, which consist of repeats of core sequences of 1-6 bp; the loci used are ideally located on different chromosomes. The core sequences of these human microsatellite DNAs can serve as hotspots for homologous recombination events, which are believed to maintain the variability of these loci [83].

Advances in molecular biology have now made it possible, using STR genotyping (a.k.a. STR DNA profiling), to uniquely identify not only human [7, 27, 49, 54, 58] tissue and cell line samples, but also cell lines derived from African green monkeys [4], dogs [14, 63], rats [17], and mice [3, 5]. In addition, the PCR assays described by Cooper et al. allow the detection of DNA from several commonly cultured mammalian species [23], although to identify the genotype it is necessary to carry out human STR analysis.

Most human cell lines established before 2000 have not been authenticated by comparing their genotypes with those of DNA extracted from tissue samples from which the cell lines were derived, in part because such original samples were not retained. One exception is the melanoma cell line M14, which in 2014 was authenticated by comparing its STR genotype with that of DNA from the patient's serum from 1975 and from a lymphoblastoid cell line derived from the same donor [46]. Because of the high incidence of misidentified cell lines, many researchers are now genotyping samples of the original tissue (e.g., donor blood, tumor cells, xenografts, patient-derived tumor xenografts, FFPE sections) to confirm that the cell lines are derived from the claimed tissue samples [50, 73, 74].

In addition to STR genotyping, it is important to highlight the other checkpoints for quality cell line management. Mycoplasma frequently contaminate mammalian tissue cultures [28, 30, 55], to the point that in 2014 seven percent of the sequences in the 1000 Genomes Project were found to be contaminated with mycoplasma DNA sequences [51], and their presence has previously been shown to affect cell behavior [29]. Therefore, screening for mycoplasma DNA is critical when culturing cells. By combining the above DNA assays, researchers can now ascertain not only whether human cell lines and tissues are derived from a single, specific individual, but also whether their cultures are misidentified or cross-contaminated with cells of the same species, with cells of different mammalian species, or with mycoplasma. This Assay Guidance Manual (AGM) chapter aims to provide researchers with an understanding of STR analysis of human and mouse cell lines.

NOTE: Please download the pdf version of this chapter for a higher-resolution rendering of these figures.

Principles of PCR Amplification of STR Loci and Identification of PCR Products by Capillary Electrophoresis

Figure 1 outlines a laboratory procedure for authenticating cell lines upon establishment or receipt in a laboratory. Ideally, the STR profiles of cell lines are compared to those of tissue samples at the genome level to determine whether they are derived from the original donor and therefore can be used to model a specific biological process. Table 1 illustrates how to determine at which stage of handling during cell line establishment human cell lines and tissue samples can become misidentified, and by what process this occurred. Unfortunately, this cannot be done for most older cell lines because donor tissue samples have not been retained.

Figure 1. Standard Laboratory Procedure for Establishing Cell Lines. Flow chart of the STR DNA genotype analysis procedure for establishing and authenticating cell line identities.

Table 1. Use of STR Profiles to Determine Scenario by which Human Cell Lines and Tissue Samples Become Misidentified *

Currently, up to 26 STR loci can be examined. To perform STR genotyping, PCR primers are designed to amplify each selected STR locus so that each of its alleles is distinguishable by size. One primer of each pair is labeled with a fluorescent dye. The range of sizes for each STR locus is determined by the number of variants that differ in length. Figure 2 below illustrates this process for human STR locus D7S820. From the length of the PCR amplicons, the number of repeats in an STR locus can be deduced and, if necessary, confirmed by sequencing. Capillary electrophoresis (CE) allows length determination of STR PCR products to an accuracy of approximately 0.5 nucleotide by comparison with an internal size standard (ISS). Furthermore, comparing the STR allele length to an allelic ladder allows for accurate allele calls based on the actual number of repeats.

Figure 2. Example of Analysis of STR alleles by Sizing of PCR Amplicons. Determination of alleles by sizing of PCR amplicons of the human STR locus D7S820 on chromosome 7 with different tetranucleotide STR repeat sequences. Allele 8 has 8 repeats, allele 9 has (more...)

In STR genotype analysis, the most commonly used human STR loci consist of tetranucleotide repeats (e.g. GATA), but some kits include a few STR loci that have pentanucleotide repeats (e.g. CATGA). The resulting PCR products usually differ by units of four base pair repeats. The alleles can be simply whole numbers (e.g., with 5, 6, 7, 8, 9, 10, 11, 12 repeats as illustrated in Figure 2). In addition, there are variants with partial repeats due to insertions/deletions which lead to a microvariant (less than a full repeat). For example, (GATA)7 GATA is an 8-repeat allele, and (GATA)7 GATA GA is an 8-repeat allele with an extra 2 bp, which would result in a microvariant repeat of an 8.2 allele. Microvariants that have an additional 1, 2, or 3 bp are indicated by these alleles being designated by a number after a decimal, for example, alleles 8.1, 8.2, and 8.3, respectively. In the case of tetranucleotide repeats, there would not be an 8.4 allele since that would be equivalent to a 9-repeat allele.
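The naming rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of any genotyping kit: the function name and the flanking length (primers plus non-repeat sequence) are hypothetical, and real allele calls are made against an allelic ladder rather than by arithmetic alone.

```python
def allele_designation(amplicon_bp: int, flank_bp: int, repeat_unit_bp: int = 4) -> str:
    """Name an allele from its amplicon length, e.g. '8' or '8.2'.

    flank_bp is the non-repeat part of the amplicon (a hypothetical value
    here; it depends on the primer design of the kit in use).
    """
    repeat_region_bp = amplicon_bp - flank_bp
    if repeat_region_bp < 0:
        raise ValueError("amplicon is shorter than the flanking sequence")
    # Whole repeats, plus any leftover bases that make a microvariant.
    full_repeats, extra_bp = divmod(repeat_region_bp, repeat_unit_bp)
    if extra_bp == 0:
        return str(full_repeats)
    return f"{full_repeats}.{extra_bp}"


# Hypothetical tetranucleotide locus with 200 bp of flanking sequence.
print(allele_designation(232, 200))  # 8 full repeats
print(allele_designation(234, 200))  # 8 repeats plus 2 bp
print(allele_designation(236, 200))  # a 9-repeat allele, never "8.4"
```

Because the remainder of a division by four can only be 0, 1, 2, or 3, the sketch can never produce an 8.4 allele, which mirrors the rule stated above.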

The PCR amplifications for different STR loci can be combined, i.e., multiplexed, so that a single PCR can test several STR loci. For loci that share a dye label, the amplicon size ranges are designed so that they do not overlap. The PCRs can be further multiplexed by using different fluorescent dyes. Some current capillary instruments can separate up to eight different dyes spectrally, which greatly expands the number of loci that can be compared when 3-5 STR loci are analyzed per dye.

Figure 3 illustrates this separation of the peaks associated with different fluorescent dyes (top panel) and as seen when deconvoluted in their separate channels (middle four panels). Currently, up to six different dyes are commonly used in a multiplex PCR. As a result, most human STR genotyping kits test 16-26 different STR loci.
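The design constraint behind multiplexing (loci that share a dye must occupy separate size windows) can be checked mechanically. The sketch below is illustrative only; the locus names, dyes, and size ranges are invented and do not describe any commercial kit.

```python
from itertools import combinations
from typing import List, Tuple

# (locus name, dye, smallest amplicon in bp, largest amplicon in bp)
Locus = Tuple[str, str, int, int]


def overlapping_loci(panel: List[Locus]) -> List[Tuple[str, str]]:
    """Return pairs of loci that share a dye and have overlapping size ranges."""
    clashes = []
    for a, b in combinations(panel, 2):
        same_dye = a[1] == b[1]
        ranges_overlap = a[2] <= b[3] and b[2] <= a[3]
        if same_dye and ranges_overlap:
            clashes.append((a[0], b[0]))
    return clashes


# Hypothetical panel: A and C share a size window but use different dyes.
panel = [
    ("locus_A", "blue", 100, 150),
    ("locus_B", "blue", 160, 220),
    ("locus_C", "green", 100, 150),
]
print(overlapping_loci(panel))  # no clashes expected for this panel
```

An empty result means every peak can be assigned to exactly one locus from its dye and size alone.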

Figure 3. Example of Human STR Profile. Electropherogram of cell line 9947a control DNA using the Thermo Fisher Applied Biosystems AmpFLSTR Identifiler PCR Amplification Kit (Cat No. 4322288). This figure illustrates how the amplified PCR products tagged with different (more...)

The main advantage of genotyping STR loci is that in human populations each locus has many different sequence variants, which is also seen in other mammals. Table 2 (modified from the 2021 revision of the ASN-0002 standard [49]) lists the chromosomal positions, sequence motifs of STR loci, and the number of variants (differentiated by fragment length or sequence) of the 13 STR loci recommended for authentication of human research samples (tissues and cell lines). Unlike single nucleotide polymorphisms (SNPs) with at most four different variants, there are between 19 and 94 different alleles distinguishable by length and up to an additional 57 alleles, which have identical lengths but are distinguishable by their sequence, at these 13 STR loci (see NIST STRBase for detailed information). This enormous variability provides much greater power of distinguishing between individual human samples with only a single 13-locus multiplexed assay than the same number of SNP assays. In summary, the polymorphism or informativeness of these STR markers, which display many variations in the number of the repeating units between alleles and among loci, can be used to distinguish between unrelated cell lines and in some cases variants of the same cell line.

Table 2. Characteristics of the 13 human reference STR loci used for cell line and tissue authentication*

In a normal diploid cell, there are two alleles at a given locus on a chromosome: one allele is derived from the mother (M) and the other from the father (F). The inheritance of human STR alleles is illustrated in Figure 4. A child of these two parents could inherit two copies of the 9-repeat allele, one from each parent, and would then show only a single peak for the 9-repeat allele, thus being homozygous at the D7S820 locus. Alternatively, as shown in Figure 4, the child can inherit the 11-repeat allele from the mother and the 10-repeat allele from the father and thus be heterozygous at this locus.
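As a small worked example of this inheritance pattern, the following sketch enumerates the genotypes a child could have at one locus, using the parental alleles from Figure 4 (mother 9 and 11, father 9 and 10). The function name is hypothetical, and mutation is ignored.

```python
from itertools import product
from typing import List, Tuple


def possible_child_genotypes(mother: Tuple[int, int],
                             father: Tuple[int, int]) -> List[Tuple[int, int]]:
    """List the distinct genotypes from one maternal and one paternal allele."""
    pairs = {tuple(sorted(pair)) for pair in product(mother, father)}
    return sorted(pairs)


# D7S820 alleles from Figure 4.
for genotype in possible_child_genotypes((9, 11), (9, 10)):
    kind = "homozygous" if genotype[0] == genotype[1] else "heterozygous"
    print(genotype, kind)
```

Four combinations are possible here; only the 9,9 combination gives a single peak, and the 10,11 combination is the child shown in Figure 4.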

Figure 4. Inheritance of Human STR Alleles. (A) A mother with STR alleles 9 & 11 in D7S820 and a father with STR alleles 9 & 10 in the same locus can have a child (F1) with alleles 10 & 11 in D7S820. Note, the vertical gray bars indicate (more...)

In contrast to the majority of normal non-germline cells, which have two copies of all autosomal chromosomes and of all the genes on those chromosomes, cultured cells and tissue samples from tumors, and even human pluripotent stem cells (hPSCs), may lose or gain copies of chromosomal segments or even of whole chromosomes [12]. Consequently, populations of cultured cells, which are often passaged (sub-cultured) by small dilutions of a culture, may show loss of heterozygosity (LOH) at diallelic loci due to some cells in the culture having lost an allele. Initially, this may show up as an allelic imbalance between the STR alleles, with one allele peak being taller than the other (see Figure 5). With time, this imbalance can change to the point that the minor allele is not detected in the culture. Alternatively, the imbalance can be restored to balance or even changed to where the major allele becomes the minor allele, as described by Parson et al. [64].
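One simple way to put a number on allelic imbalance is the ratio of the shorter peak to the taller peak at a locus. The sketch below assumes peak heights in relative fluorescence units; the 0.6 cutoff is a placeholder chosen for illustration and is not a value given in this chapter, so any real threshold should come from the laboratory's own validation.

```python
from typing import Dict


def classify_locus(peaks: Dict[str, float], imbalance_cutoff: float = 0.6) -> str:
    """Describe one locus from its allele peaks (allele name -> peak height).

    imbalance_cutoff is a hypothetical threshold on the ratio of the
    shorter peak to the taller peak.
    """
    heights = sorted(peaks.values(), reverse=True)
    if not heights:
        return "no call"
    if heights[-1] <= 0:
        raise ValueError("peak heights must be positive")
    if len(heights) == 1:
        # A lone peak cannot distinguish a homozygote from complete LOH.
        return "single peak"
    if len(heights) > 2:
        return "more than two peaks"
    ratio = heights[1] / heights[0]
    return "allelic imbalance" if ratio < imbalance_cutoff else "balanced"


print(classify_locus({"10": 2000, "11": 1900}))
print(classify_locus({"10": 2000, "11": 600}))
```

Tracking this ratio across passages would show the drift described above, including the case where the minor allele eventually drops out.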

Figure 5. Allelic Imbalance. Cell line sample showing allelic imbalance at the human D7S820 locus. Data courtesy of CTK.

Characteristics, Utility, and Advantages of Human STR Markers

STR loci selected for cell line authentication are chosen because they display the greatest possible variations for discriminating among human cell lines with the fewest PCR targets and can readily detect cross-contaminating cells. The criteria for selecting STR loci for human cell line authentication include the following characteristics.

  • High discriminating power due to each locus having many alleles

  • High observed heterozygosity >70% in human populations

  • Robust and reproducible PCR results

  • Low stutter characteristics

  • Low mutation rate

  • Allele sizes of PCR products fall in the range of 90-500 bp, which allows easier genotyping of degraded DNA that often falls in this size range.

Advantages of Identification of Cell Lines and Tissue by STR Genotyping

STR analysis is a universally accepted forensic method for human cell line authentication, because it is robust in its ability to identify unique human cell lines, easy to perform, accessible to scientists, and affordable. Some other advantages of STR analysis include:

  • Target sequence consists of microsatellite DNA

  • Typically uses 1-2 ng of genomic DNA

  • 1 or 2 size fragments; discrete alleles allow digital record of data

  • Highly variable within populations, thus highly informative

  • Banding pattern is reproducible

  • Easily PCR amplifiable allowing for high throughput

  • Multiplexing of relatively few PCRs produces highly informative data

  • Small amplicon size ranges allow multiplexing

  • Allelic ladders simplify interpretation

  • Small product sizes (less than 500 bp) are compatible with some degraded DNA samples, such as those extracted from Formalin Fixed Paraffin Embedded (FFPE) samples. Commercially, there are special kits available for very degraded samples that produce DNA fragments shorter than those made with the kits used for general STR genotyping.

  • Rapid processing is attainable.

Below in Step 4 - Authentication Protocol for Human Samples by STR Analysis we provide an example of how to use a human STR genotyping kit.

Authentication of Mouse Samples by STR Analysis

Introduction

Authentication of mouse cell lines and tissue samples presents a particular problem: many mouse strains are inbred to be as isogenic as possible. As a consequence, in contrast to human samples, it is very difficult to differentiate between individual cell lines and tissue samples from these strains, since they were often derived from clones of the same individual. A group at the National Institute of Standards and Technology has developed a mouse cell line authentication method using mouse-specific STR loci that allows the identification of many different mouse strains derived from these inbred lines and from non-inbred strains [1-3, 5]. The method was validated by the Mouse Cell Line Authentication Consortium in an interlaboratory study that tested 50 commonly used mouse cell lines [3], and individual, unique profiles were obtained from non-related cell lines. This method is licensed and currently available through several cell line authentication service providers.

Mouse STR Markers

A multiplex PCR assay that targets eighteen mouse-specific tetranucleotide STR markers and two human STR markers, for contamination detection, was developed for mouse cell line authentication. The eighteen mouse loci in this assay include 1-1, 1-2, 2-1, 3-2, 4-2, 5-5, 6-4, 6-7, 7-1, 8-1, 11-2, 12-1, 13-1, 15-3, 17-2, 18-3, 19-2, and X-1.

These mouse loci mainly consist of simple tetranucleotide repeats as shown in Table 3; however, there are a few loci that exhibit more complex repeats, such as marker 7-1. In cases like STR 7-1, two samples could present the same fragment length at this marker but have very different sequences. Sequencing is a useful tool when more resolution is needed to discriminate between two samples. In addition to repeat motifs, Table 3 also lists known allele ranges and associated fragment lengths for each STR marker. Note that microvariants are present in the allele ranges and are very common in mouse STR profiles.

Table 3. Characteristics of the 18 Mouse STR Loci [2, 3]

Figure 6 presents an electropherogram of an STR profile of the mouse cell line P19. The two marker regions in gray in the second and third panels are for two human loci (D4S2408 and D8S1106) as described in Step 5, wherein we outline a standardized method for the authentication of mouse cell lines by STR genotyping. This link has detailed information on mouse cell line P19 and a higher resolution image of the electropherogram.

Figure 6. Electropherogram of P19 Mouse Cell Line STR Profile (ATCC# CRL-1825)

STR DNA genotyping assays for authentication of human and mouse cell line identities and detection of cross-contaminating intraspecies cells

Targeted user group for this AGM chapter on STR analysis

This AGM chapter outlines for researchers how to STR genotype human and mouse DNA samples and how to interpret the test results. A much fuller explanation of STR genotype analysis is presented in the ANSI-ATCC ASN-0002 Standard protocol revised in 2021 [49], which contains many more examples and explanations of various aspects of STR genotype analysis of cell lines. In contrast to this AGM chapter, the ASN-0002-2021 Standard is targeted to both the technical personnel performing the assay and the researcher interpreting the results. The principles and main steps of the STR assays described herein are applicable to the authentication of cell lines by identification using STR genotyping.

STR analysis can be performed in most laboratories that are equipped for molecular techniques, provided they have access (for example through a core laboratory) to a capillary gel electrophoresis instrument and appropriate data collection software for data analysis. In most cases, however, laboratories submit DNA samples or even cell line cultures to core facilities or commercial DNA testing laboratories for testing, which send the results back to the researcher. Different levels of analyses of the data are often offered as an additional service.

It can be useful for the research labs to learn how to interpret the data themselves so they can understand the significance and reliability of the data. This understanding is a principal goal of this AGM chapter. Therefore, if the labs choose to perform the STR assays in-house or to receive the raw STR data, the genotypes can be determined either with commercially available software or with the free NCBI software package OSIRIS. This procedure is an easy, low cost, and reliable method for the authentication of human and mouse cell lines using the commercially available kits and/or services.

DISCLAIMER

Certain commercial equipment, instruments, or materials (or suppliers, or software, etc.) are identified in this AGM chapter to foster understanding. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, the authors, or the National Center for Advancing Translational Sciences (NCATS). Also, it does not imply that the materials or equipment identified are necessarily the best available for the purpose. Below we use a Promega STR genotyping kit and software packages from different vendors only to exemplify the STR procedure, but we do not advocate the use of any one product over another. Refer to the manufacturers' manuals for detailed instructions on using the specific kits and instruments that are available.

Outline of STR Genotyping Assay Steps

STR genotyping encompasses seven of the nine following laboratory steps depending on whether the samples are of human (Steps 1-4 & 7-8) or mouse (Steps 1-2, 5-8) origin.

1. Collection of human or mouse samples from cell line cultures or tissue samples (e.g., from related fresh or frozen tumor samples, FFPE sections, xenografts, peripheral blood mononuclear cells, serum) for DNA extraction;

2. DNA extraction, purification, and quantification from one or more of the above samples;

3. Choosing a kit for the authentication of human DNA samples;

4. Multiplex PCR amplification of human STR loci, usually in a single reaction for each DNA sample. Included are both a negative control PCR without DNA and a PCR with DNA from a reference sample with known alleles;

5. Choosing a kit for the authentication of mouse DNA samples;

6. Multiplex PCR amplification of mouse STR loci, usually in a single reaction for each DNA sample. Included are both a negative control PCR without DNA and a PCR with DNA from a reference sample with known alleles;

7. Capillary gel electrophoresis (CE) to separate the different STR amplicons that are tagged with different fluorophores. In the 1990s and early 2000s slab gels were used, but they did not size amplicons as accurately as current CE instruments do;

   To each sample, a size ladder of fragments labeled with a different fluorophore is added to determine the sizes of allelic amplicons. In addition, an allelic ladder containing amplicons of known allele sizes is electrophoresed with a size ladder in the same experiment in order to identify STR alleles in the test samples;

8. Allele calling software (e.g., Gene Mapper from Applied Biosystems/Thermo Fisher Scientific, Gene Marker from SoftGenetics, or OSIRIS from the National Library of Medicine/National Center for Biotechnology Information) is used to identify the alleles present in each sample and to confirm that no alleles are present in the No DNA control sample, and that the alleles in the reference sample and allelic ladder are called correctly;

After the collection of the STR data, the data are analyzed and interpreted by comparing the allele calls of each of the test samples to reference profiles as described in the Data Analysis and Interpretation section. For example, a cell line profile is compared to STR profiles for the cell line in an STR database (e.g., Cellosaurus, one available from cell line repositories, a published profile by the originator of the cell line, an in-house database) or ideally to an STR profile of DNA isolated from a sample of tissue (fresh, frozen, FFPE, xenograft, blood) from which the cell line was established. This allows a determination of whether the cell line or tissue sample (e.g., xenograft, patient-derived tumor xenograft) being used for experiments is from the original donor.
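To make the comparison step concrete, the sketch below scores a test profile against a reference profile. It uses the Tanabe percent-match formula (twice the number of shared alleles divided by the total number of alleles in both profiles), which is an assumption brought in for illustration: this section does not name a scoring method, and the acceptance cutoff is left to the Data Analysis and Interpretation section. The profiles shown are invented, and only loci present in both profiles are scored.

```python
from typing import Dict, Set

Profile = Dict[str, Set[str]]  # locus name -> set of allele names


def tanabe_percent_match(query: Profile, reference: Profile) -> float:
    """Percent match = 2 x shared alleles / (query alleles + reference alleles)."""
    shared = 0
    total = 0
    for locus in query.keys() & reference.keys():
        q_alleles, r_alleles = query[locus], reference[locus]
        shared += len(q_alleles & r_alleles)
        total += len(q_alleles) + len(r_alleles)
    if total == 0:
        raise ValueError("profiles have no loci in common")
    return 100.0 * 2 * shared / total


# Invented example: the test sample has an extra allele at TH01.
test_sample = {"D7S820": {"10", "11"}, "TH01": {"6", "9.3"}}
database_entry = {"D7S820": {"10", "11"}, "TH01": {"6"}}
print(round(tanabe_percent_match(test_sample, database_entry), 1))
```

Alleles are kept as strings so that microvariants such as 9.3 compare exactly rather than as floating-point numbers.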

Step 1 - Compiling cell line information and management prior to use

When a cell line is first received into the laboratory, it is essential to capture as much information as possible on its history, growth, and functional characteristics, including what is known about its true identity. This information is important for tracking the behavior of a cell line during culturing and for its characterization; for judging whether it is appropriate for the planned experiments; for determining whether it is known to be, or has been, misidentified prior to being received by a laboratory, or whether a culture of it has become cross-contaminated during the experiments (see Figure 1); and finally for correctly reporting which cell line and/or tissue samples were used for the reported experiments.

Cellosaurus [9, 10] is a rich resource with descriptions for 130,952 cell lines, including 97,714 from humans, 22,740 from mouse, 2,493 from rat, and 863 from dog (version 39, October 2021). The International Cell Line Authentication Committee (ICLAC) is continuously updating its register of misidentified cell lines (mostly human samples, see ICLAC.org/cross-contamination/ and [22]) and this information is incorporated into the Cellosaurus database. Therefore, prior to using any cell line for research, both of these resources should be consulted for the latest information about its identity and characteristics, so as not to waste effort and research funds on using misidentified cell lines. The following information should be compiled from these resources and/or from the original publications:

  • Name of cell line

  • Cellosaurus ID number if available

  • Name of cell line source

  • Name of cell line originator

  • Date cell line was established

  • Reference(s) describing the establishment of the cell line

  • Tissue of origin

  • Species

  • Population doubling levels

  • Unique characteristics and function

  • Complete growth medium

  • Doubling time

After collecting this information, it should be stored in an in-house laboratory database or commercial software package (e.g., the cell line tracking software Find Cell by Find Genomics, [86]) for consultation when troubleshooting abnormal observations or planning to use the cell line or tissue sample for subsequent experiments. Figure 7 outlines a plan and standardized procedures which should be established in-house for managing the cells during expansion and their use for experiments, such as the creation of a Seed stock (Master Cell Bank) and a Distribution stock (Working Cell Bank).

Figure 7. Standard Laboratory Procedure for Handling Cell Lines. Follow this procedure to ensure the use of valid cell lines and tissue samples by cell line authentication.

Upon thawing, spot approximately 20 µL of cell suspension containing 1 x 10^6 cells/mL from the donor vial directly onto FTA paper for subsequent STR analysis to establish the baseline STR profile. Alternatively, freeze at -20°C a similar or larger aliquot of cells in a microfuge tube for subsequent DNA extraction and STR analysis. The baseline STR profile from the original donor material should be determined prior to starting experiments and then used to compare against all subsequent STR DNA profiles performed on the various cell banks.

The remaining cells are expanded to create a Seed stock from which a Distribution stock is prepared. Representative vials from both the Seed and Distribution stocks are subjected to another round of STR DNA profile analysis in addition to other quality control procedures. The STR DNA profiles for both Seed and Distribution stocks are compared to the baseline profile of the original donor material.

STR DNA profile analysis involves the simultaneous amplification of STR markers plus a locus for gender determination. The amplicons from the PCR are resolved by capillary electrophoresis and sized using internal size standards (ISS). The sized fragments are then converted into alleles by comparison with the allelic ladders, and the assigned alleles are converted to numeric values, which are used to create a baseline profile. The baseline DNA profiles are used to create a reference database. All subsequent STR DNA profile analyses performed on the various cell line samples are compared to the baseline profile of that cell line in the reference database. The STR DNA profile should also be compared to profiles of other cell lines in Cellosaurus and any in-house reference database to determine if the results from these quality control tests meet the acceptance criteria between the cell line and its original tissue (or its derivatives).
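The conversion from sized fragments to allele names can be sketched as a nearest-match lookup against the allelic ladder. This is a simplified illustration of what allele calling software does, not a description of any particular package: the ladder sizes are invented, the function names are hypothetical, and the 0.5 bp window is taken from the sizing accuracy quoted earlier in this chapter.

```python
from typing import Dict, List, Optional


def call_allele(fragment_bp: float, ladder: Dict[str, float],
                tolerance_bp: float = 0.5) -> Optional[str]:
    """Return the ladder allele closest to the fragment, or None if off-ladder."""
    allele, ladder_bp = min(ladder.items(), key=lambda item: abs(item[1] - fragment_bp))
    if abs(ladder_bp - fragment_bp) <= tolerance_bp:
        return allele
    return None


def build_profile(sized_peaks: Dict[str, List[float]],
                  ladders: Dict[str, Dict[str, float]]) -> Dict[str, List[str]]:
    """Convert sized peaks per locus into allele names; 'OL' marks off-ladder peaks."""
    profile = {}
    for locus, sizes in sized_peaks.items():
        calls = [call_allele(size, ladders[locus]) for size in sizes]
        profile[locus] = [call if call is not None else "OL" for call in calls]
    return profile


# Invented ladder sizes (bp) for one tetranucleotide locus.
ladders = {"D7S820": {"8": 231.0, "9": 235.0, "10": 239.0}}
print(build_profile({"D7S820": [235.2, 239.3]}, ladders))
```

Peaks that fall outside the window of every ladder allele are flagged rather than forced onto the nearest allele, since they may be microvariants or artifacts that need manual review.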

Step 3 - Commercially-available Human STR Analysis Kits

Human STR analysis was developed for forensic uses and several STR analysis kits are commercially available from different manufacturers. These are well validated for normal human testing. The most widely available are those from Promega Corporation, Thermo Fisher Scientific/Applied Biosystems, and Qiagen. These and many other kits are listed in the 2021 revision of the ASN-0002 standard [49]. These kits are also used for authentication of human tissues and cell lines, but as discussed below, cell lines present some unique differences that are not encountered with normal tissue samples.

For the authentication of cell lines, the 2021 revision of the ASN-0002 standard [49] recommends the testing of a common gender determining locus (amelogenin) and the following 13 STR loci as a minimum number needed for the identification of cell lines and tissue samples:

  • Amelogenin (gender determination gene on the short arms of the X and Y chromosomes, with the X-linked allele being 6 base pairs shorter than the Y-linked allele), and

  • The following STR loci: CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, FGA, TH01, TPOX, and vWA, which are dispersed on different human chromosomes.

For determining the gender of the human tissue donor, amelogenin is the most commonly used gene. Copies of this gene are present on both the X and the Y chromosomes. AMELX is located in the Xp22.1-Xp22.3 region and AMELY is in the Yp11.2 region. In intron 3 of AMELX, there is a 6 bp deletion of AAAGTG, while in AMELY these 6 bp are present. Primers have been designed to co-amplify this region from both chromosomes, so that PCR yields two products that differ by 6 bp. A normal germline DNA sample from a female will produce only the shorter product from the two X chromosomes, while PCR of a germline DNA sample from a normal male will produce two amplicons that differ by 6 bp, one from the X chromosome and one from the Y chromosome.

As discussed extensively in the ASN-0002 2021 revision [49], males can infrequently lose this portion of the Y chromosome (6 of 29,432) and as a consequence appear by this amelogenin PCR test to be genetically female, although they are not infertile. On comparing a large collection of cell lines, Liang-Chu reported finding that about 45% of cell lines reportedly derived from males did not have the AMELY allele [52]. Unfortunately, the identities of most older cell lines have not been confirmed by STR or other genetic analyses to be from the purported donor, so the true identities of these AMELY-minus cell lines cannot be validated. As a consequence, although this locus is useful to identify a cell line as being derived from a male donor, the absence of the AMELY allele does not prove that the cell line was derived from a female. An example of this is the cell line M14, which is derived from a melanoma of a male patient. This cell line was thought to be from a female (and called MDA-MB-435 or more correctly MDA-MB-435S), since it did not have an AMELY allele and had two X chromosomes. However, it has been clearly shown that, despite being extensively used as a model of female breast cancer, it was actually from a male donor: a large part of the Y chromosome (including the AMELY locus and the centromere) had been lost [46] and the X chromosome had been duplicated. To partially address such genetic changes, some newer STR kits have added other Y-linked loci, and some forensic STR kits are designed specifically to test for different segments of the Y chromosome (e.g., the AGCU ScienTech Inc. AGCU Expressmarker 16+18Y STR kit with 18 Y-STR loci, the Promega VersaPlex 27PY System with three Y-STR loci, or the Applied Biosystems Yfiler Plus PCR Amplification Kit with 27 STR loci).
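The asymmetry described above (an AMELY peak supports a male donor, but its absence does not prove a female donor) can be made explicit when amelogenin calls are recorded. The following is a minimal sketch; the function name and the wording of the returned messages are illustrative assumptions, not part of any kit or standard.

```python
def interpret_amelogenin(alleles):
    """Return a cautious interpretation of amelogenin calls.

    `alleles` is a list such as ["X"] or ["X", "Y"] (hypothetical input format).
    """
    observed = {a.strip().upper() for a in alleles if a.strip()}
    if "Y" in observed:
        # An AMELY product is positive evidence for a male donor.
        return "male donor indicated (AMELY present)"
    if observed == {"X"}:
        # Only AMELX seen: female donor OR male donor whose cells lost AMELY.
        return "inconclusive: female donor or male donor with AMELY loss"
    return "no call"


print(interpret_amelogenin(["X", "Y"]))
print(interpret_amelogenin(["X"]))
```

Returning "inconclusive" for an X-only result keeps a cell line such as MDA-MB-435S from being recorded as female on the basis of amelogenin alone.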

Step 4 - Authentication Protocol for Human Samples by STR Analysis

This technology is capable of discriminating between two human cell lines originating from different individuals, with random probabilities of identity (POI) ranging between 1 × 10⁻¹⁵ and 3 × 10⁻¹⁵ (see Chapter 10 in reference [19] for calculation method).
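To give a feel for how such small probabilities arise, the sketch below uses one common population-genetics definition of the probability of identity: at each locus, the chance that two unrelated individuals share a genotype under Hardy-Weinberg assumptions, multiplied across independent loci. Whether this is exactly the calculation used in reference [19] is an assumption, and the allele frequencies are invented for illustration only.

```python
from itertools import combinations
from math import prod


def locus_poi(freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    Assumes Hardy-Weinberg proportions: homozygote p^2, heterozygote 2pq.
    """
    homozygous = sum(p ** 4 for p in freqs)
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


def combined_poi(loci):
    """Multiply per-locus values, assuming the loci are unlinked."""
    return prod(locus_poi(freqs) for freqs in loci.values())


# Hypothetical allele frequencies (NOT real population data).
hypothetical = {
    "LocusA": [0.5, 0.5],
    "LocusB": [0.25, 0.25, 0.25, 0.25],
}
print(combined_poi(hypothetical))
```

With real loci that each carry many alleles, the per-locus values are small, and the product across 13 or more unlinked loci shrinks rapidly.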

As mentioned above, different kits from various suppliers can be used with assorted markers for the analysis of STRs of purified DNA. A commonly used kit is the PowerPlex 18D (Cat. No. DC1802) from Promega Corporation. Using the DNA isolated as described above, this kit allows co-amplification and four-color fluorescent detection of eighteen loci; namely, amelogenin, CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, FGA, TH01, TPOX, vWA plus four additional loci: D2S1338, D19S433, Penta D, and Penta E. A fifth fluorescent dye is used for the size ladder. The system is optimized for analysis of common laboratory cell suspension samples and other samples as outlined above from almost any source of human DNA. Ideally, the isolated gDNA should consist of fragments greater than 1 kb in length. Genomic DNA from FFPE samples is often shorter than this minimum and may require modification of the protocols below. As mentioned previously, other STR genotyping kits are available, such as the AmpFLSTR Identifiler kit from Thermo Fisher Scientific/Applied Biosystems, which tests for all loci above except Penta D and Penta E (STR profile shown in Figure 3).

The PowerPlex 18D System is compatible with the ABI PRISM 3100 and 3100-Avant and Applied Biosystems 3130, 3130xl, 3500, 3500xL, 3730, 3730XL, SeqStudio, and Promega's Spectrum Compact CE System Genetic Analyzers. For specific and detailed protocols, refer to the user manuals available from the kit and instrument manufacturers.

Materials

  • GeneAmp PCR System 9700 thermal cycler (Applied Biosystems) or equivalent.

  • MicroAmp optical 96-well reaction plate (Applied Biosystems) or equivalent.

  • Microcentrifuge with rotor(s) specific for 1.7 mL and 0.2 mL tubes and for 96-well plates.

  • Aerosol-resistant pipet tips.

  • Microfuge tubes 1.5 mL with frosted caps and 0.2 mL PCR tubes with domed caps (or equivalent). Note that if left in the original plastic bags they are usually sufficiently clean to avoid contamination of samples by DNA. Handle with clean forceps and gloves.

  • NOTE - Do not autoclave the plastic ware as the heat can deform tubes and contaminate the plastic.

  • 1.2 mm Micro-Punch and cutting mat.

Table 4A and Table 4B list the pre- and post-amplification components and long-term storage instructions for the human STR profiling kit PowerPlex 18D System.

Table 4A.

Human PowerPlex 18D System Pre-amplification Components and Storage

Table 4B.

Human PowerPlex 18D System Post-amplification Components and Storage

Amplification setup

1.

Completely thaw and vortex the PowerPlex D 5X Master Mix and PowerPlex 18D 5X Primer Pair Mix.

2.

Determine the number of reactions needed. Note: Include negative and positive controls and add enough for an additional 1 or 2 reactions to compensate for pipetting errors.

3.

For reaction assembly, label a clean 0.2 mL MicroAmp plate or individual 0.2 mL PCR tubes with domed caps (or equivalent) for each reaction.

4.

Vortex the PCR amplification master mix and the primer mix for 10 seconds.

5.

Depending on the type of DNA, add reagents in the order listed in Table 5A for FTA samples or Table 5B for DNA purified from tissue or cell lines to each sterile tube or well.

6.

Either add 25 µL of PCR amplification mix to each reaction tube containing an FTA disk (Table 5A), or combine 1 ng of purified DNA, in a maximum volume of 15 µL, with the reaction mixture in Table 5B.

7.

For the positive amplification control:

a.

Vortex the tube of 2800M Control DNA.

b.

Dilute 2800M Control DNA to a concentration of 5 ng/µL.

c.

Pipet 1 µL of diluted Control DNA into reaction well or tube containing 25 µL of PCR amplification mix.

d.

For the negative amplification control use either a reaction well or tube containing 25 µL amplification mix without DNA or a reaction well containing 25 µL amplification mix with an FTA disk without DNA.

8.

Seal the plate or individual tubes and briefly centrifuge the plate/tubes to bring the disks to the bottom of the wells.

Table 5A.

Human PCR Amplification Setup Reaction Volumes for Samples on FTA Cards

Table 5B.

Human PCR Amplification Setup Reaction Volumes Using Samples of Purified DNA

Note 1: The number of cells spotted onto the FTA card and the amount of Control DNA used to obtain positive results must be empirically predetermined.

Note 2: Ensure that the cells are uniformly suspended with no clumps of cells present prior to applying to the FTA card, otherwise different punches of the same FTA card can produce variable profiles.

Note 3: Individual 0.2 mL PCR tubes with domed caps (or equivalent) are less wasteful than 96-well plates when only a few reactions are to be performed and their use can reduce the chances of sample contamination.
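The reaction count in step 2 of the amplification setup, and the resulting bulk volumes, can be sketched as follows. This is an illustration only: it assumes a 25 µL amplification mix made of the two 5X reagents plus water (as used for FTA disks), so each 5X reagent contributes one fifth of the volume. The function and key names are hypothetical, and Table 5A, Table 5B, and the kit manual remain the authoritative source for volumes.

```python
def human_mix_volumes(n_samples, n_controls=2, n_extra=2, reaction_uL=25.0):
    """Bulk volumes for an amplification mix of two 5X reagents plus water.

    n_controls: positive + negative controls; n_extra: overage for pipetting.
    """
    n = n_samples + n_controls + n_extra
    per_rxn_5x = reaction_uL / 5  # a 5X reagent is diluted to 1X
    return {
        "reactions": n,
        "master_mix_5X_uL": n * per_rxn_5x,
        "primer_pair_mix_5X_uL": n * per_rxn_5x,
        "water_uL": n * (reaction_uL - 2 * per_rxn_5x),
    }


# Six samples + two controls + two extra reactions = ten reactions.
print(human_mix_volumes(6))
```

For purified DNA, part of the water volume is replaced by the DNA sample, so the water is better added per tube rather than in bulk.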

Thermal cycling

1.

Place the MicroAmp plate or individual tubes in the thermal cycler.

2.

Select and run the recommended protocol provided in Table 6.

3.

Store amplified samples at 4°C or freeze after completion of the thermal cycling protocol.

4.

Optimize the cycle number and injection conditions of the protocol based on the kits and instruments used, starting from the conditions recommended by the manufacturers of the kits and instruments. Also, be aware that some thermal cyclers have variable ramp speeds and this can affect the yield and quality of the PCR products.

Table 6.

Example of Thermal Cycling Protocol

NOTES

  • The temperatures of these steps and the number of recommended cycles may vary between different kits from different manufacturers.

  • The duration of Step 4 is chosen to drive the complete addition of an adenine (A) nucleotide to the 3' end of the PCR amplicons by the terminal transferase activity of the DNA polymerase. The addition of this terminal A will depend on the kit and DNA polymerase used and the amount of DNA in the PCR. Excessive DNA can cause incomplete A addition. The optimal time for Step 4 should be validated for the kit being used to minimize the presence of N-1 bp products. Many laboratories use 40-60 minutes for this step.

Step 5 - NIST Protocol for Authentication of Mouse Samples by STR Analysis

In addition to the mouse STR loci described above, two human STR markers, D4S2408 and D8S1106 (Table 7), which were previously described to identify both human and African green monkey DNA [4, 41] are included in the NIST STR genotyping method for mouse samples to check for the presence of human DNA in the samples. The mouse-specific primers target STR markers found on 16 different chromosomes, with redundancy at chromosomes 1 (>164 Mb apart) and 6 (>90 Mb apart). In humans, alleles greater than 50 Mb apart are considered unlinked [18]. The forward primers are labeled at the 5’ end with one of the following fluorescent dyes: 6-FAM, VIC, NED, or PET dye (Table 7).

Table 7.

Multiplex PCR primers for Mouse Cell Line Authentication [2, 3]

To promote complete adenylation and reduce troublesome artifacts during data analysis, a guanine base (G) or a "PIGtail" sequence (note the underlined sequence GTTTCTT of the reverse primers, Table 7) was added to the 5' end of the unlabeled reverse primer [16]. Highly purified primers should be ordered (e.g., HPLC-purified) to reduce the artifacts produced during capillary electrophoresis (e.g., dye blobs). Using 100 µM primer stock solutions, prepare the multiplex primer mixture by adding the amounts of the forward and reverse primers indicated for each STR marker in Table 8 to a single tube containing 20.5 µL of TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA).

Table 8.

Preparation of Mouse Multiplex Primer Mixture (100 µL total volume). Scale up as necessary
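When scaling the primer mixture or checking a pipetting plan, the concentration of each primer in the final mixture follows from simple dilution (C1 × V1 = C2 × V2). The sketch below assumes 100 µM stocks and a 100 µL total volume, as stated above. The primer names and volumes in the example are placeholders and are not the values from Table 8.

```python
def final_conc_uM(volume_uL, stock_uM=100.0, total_uL=100.0):
    """Concentration of one primer in the mixture after dilution."""
    return stock_uM * volume_uL / total_uL


def scale_volumes(volumes_uL, factor):
    """Scale every component (TE and primers) by the same factor."""
    return {name: v * factor for name, v in volumes_uL.items()}


# Placeholder recipe: only the TE volume (20.5 µL) comes from the text.
example = {"TE": 20.5, "markerA_forward": 1.0, "markerA_reverse": 1.0}
print(final_conc_uM(example["markerA_forward"]))  # µM in the 100 µL mixture
print(scale_volumes(example, 2))
```

Because every component is scaled by the same factor, the primer concentrations in the mixture are unchanged when the batch size is increased.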

Qiagen Type-It Microsatellite PCR Kit (Cat. No. 206243) was used in the interlaboratory study to validate this method [3]. The internal lane size standard, GeneScan LIZ 600 v2.0 (Thermo Fisher Scientific Cat. No. 4408399) was used to assign fragment length to amplicons and the calibrants, used in lieu of a mouse allelic ladder, were successfully used for accurate allele call determination. A five-dye matrix standard, DS-33 (Thermo Fisher Cat. No. 4345833) was used to perform spectral calibrations on the Genetic Analyzers prior to running amplified products. The NIST Reference Material 8399 for a Mouse Allelic Ladder will be released soon.

Step 6 - Authentication Protocol for Mouse Samples by STR Analysis

Use the above mixture of primers (Table 8, for additional information see Almeida et al. [2, 3]) for the PCR amplifications of the mouse STR analyses as described below. Table 9 and Table 10 describe the pre- and post-amplification solutions and components and their recommended storage conditions for these STR reactions.

Table 9.

Mouse Microsatellite PCR Qiagen Type-It Kit Pre-amplification Components and Storage

Table 10.

Mouse PCR Post-amplification Components and Storage

Amplification setup for mouse STR assay

1.

Completely thaw the Master Mix and PCR grade water from the Qiagen Type-It Microsatellite PCR Kit (Cat. No. 206243).

2.

Determine the number of reactions needed. Note: Include a negative control, two positive controls (one mouse and one human), and an additional 2 samples for pipetting error (use Table 11 below).

3.

Prepare PCR master mix of components detailed in Table 11 and vortex for 10 seconds. Briefly spin down.

4.

Aliquot 24 µL of master mix into each 0.2 mL tube or well in 96-well plate, or equivalent.

5.

Add 1 µL of quantified mouse DNA at 2 ng/µL to appropriate sample tube/well. For the negative control add 1 µL of water to the reaction. For positive controls add 1 µL of 2 ng/µL of mouse DNA with a known STR profile to one tube and 1 µL of 2 ng/µL of human DNA sample as a contamination control to a separate tube.

6.

Mix briefly and spin down sample to the bottom of the tube or plate.

Table 11.

Amplification Setup for Mouse PCR Reaction Volumes
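The bookkeeping in the mouse amplification setup (counting reactions, then bringing each DNA sample to 2 ng/µL so that 1 µL delivers 2 ng) can be sketched as below. The function names are illustrative assumptions; the 24 µL of master mix per reaction, the control counts, and the 2 ng/µL target are taken from the steps above.

```python
def mouse_reaction_count(n_samples, n_negative=1, n_positive=2, n_extra=2):
    """Samples + negative control + mouse and human positives + overage."""
    return n_samples + n_negative + n_positive + n_extra


def dilute_to_target(stock_ng_per_uL, final_uL, target_ng_per_uL=2.0):
    """Return (DNA volume, water volume) in µL for the working dilution."""
    if stock_ng_per_uL < target_ng_per_uL:
        raise ValueError("stock is more dilute than the target concentration")
    dna_uL = final_uL * target_ng_per_uL / stock_ng_per_uL
    return dna_uL, final_uL - dna_uL


n = mouse_reaction_count(8)
print(n, "reactions;", 24 * n, "µL of master mix in total")
print(dilute_to_target(50.0, 25.0))  # 50 ng/µL stock, 25 µL working dilution
```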

Thermal cycling

1.

Place the 0.2 mL tubes or 96 well plate on the thermal cycler.

2.

Select and run the recommended protocol provided in Table 12. Ramp rate should be set to 3.35 °C/sec, if possible. The assay must be validated on the specific thermal cycler being used and the selected ramp rate.

3.

Store amplified samples at 4°C after completion of the thermal cycling protocol.

Table 12.

Mouse Multiplex PCR Thermal Cycling Protocol

Step 7 - Detection of Amplified Fragments using Automated Capillary Gel Electrophoresis Instruments

Each forward primer used in these multiplex PCR assays for cell line authentication has a fluorescent dye molecule covalently linked at its 5' end. Multiple sets of primers with different "color" fluorescent labels are used to analyze numerous different loci in a single multiplexed PCR reaction. Following the PCR step, internal size standards and deionized formamide are added to the reaction mixture, the samples are heat denatured and centrifuged to remove bubbles, and the DNA fragments are separated by size via capillary gel electrophoresis.

Several different automated capillary gel electrophoresis instruments can be used, such as Thermo Fisher Scientific / Applied Biosystems 3500, 3500XL, 3730, 3730XL, 3130, 3130XL, SeqStudio Genetic Analyzer, or Promega Spectrum Compact CE or Spectrum CE Systems.

Once the data are collected, software (such as GeneMapper versions 4, 5, or 6 and GeneMapper ID-X from Thermo Fisher Scientific / Applied Biosystems, OSIRIS from the National Library of Medicine, or GeneMarker from SoftGenetics) is used to size the amplicons based on the internal size standard. STR genotyping of each cell line is performed by converting amplicon sizes to alleles by comparison with allelic ladders [7, 21, 49]. See Table 13 for a listing of the reagents and equipment that can be used in this assay.
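The two operations described above, sizing a peak against the internal size standard and then converting that size to an allele by comparison with the allelic ladder, can be illustrated with a simplified sketch. Plain linear interpolation between flanking size-standard peaks is substituted here for the more refined sizing methods used by analysis software, and the ±0.5 bp window, the data values, and the function names are all assumptions for illustration.

```python
from bisect import bisect_right


def size_fragment(scan, standard):
    """Estimate fragment size (bp) by linear interpolation.

    `standard` is a list of (scan_position, size_bp) pairs sorted by scan position.
    """
    scans = [s for s, _ in standard]
    if not scans[0] <= scan <= scans[-1]:
        raise ValueError("peak lies outside the size standard range")
    i = min(bisect_right(scans, scan), len(scans) - 1)
    (s0, b0), (s1, b1) = standard[i - 1], standard[i]
    return b0 + (b1 - b0) * (scan - s0) / (s1 - s0)


def call_allele(size_bp, ladder, tolerance_bp=0.5):
    """Match a sized peak to the nearest ladder allele, or 'OL' (off-ladder)."""
    name, ref = min(ladder.items(), key=lambda kv: abs(kv[1] - size_bp))
    return name if abs(ref - size_bp) <= tolerance_bp else "OL"


# Hypothetical size-standard peaks and ladder sizes for one locus.
standard = [(1000, 100.0), (1200, 120.0), (1400, 140.0)]
ladder = {"9": 126.0, "10": 130.1, "11": 134.0}

size = size_fragment(1300, standard)
print(size, call_allele(size, ladder))
```

A peak that falls outside every ladder window is reported as off-ladder rather than forced onto the nearest allele, which is the behavior needed for the novel out-of-range alleles discussed later in this chapter.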

Sample preparation

1.

Preparation of loading cocktail:

For human STR genotyping, combine and mix the internal size standard (e.g., Promega CC5 Internal Lane Standard 500 with the Promega PowerPlex 18D kit or GeneScan 600 LIZ with an Applied Biosystems kit, Figure 8) and Hi-Di formamide, for example:

(1 µL CC5 ILS 500) × (number of samples) + (10 µL Hi-Di formamide) × (number of samples) = amount of loading cocktail

For mouse STR genotyping, combine and mix 0.5 µL GeneScan 600 LIZ internal size standard with 9.5 µL of Hi-Di formamide for each sample (prepare a master mix of this for all samples).

Note: the amount of internal lane size standard (ISS) for both human and mouse STR genotyping may need to be adjusted so the ISS peaks are of sufficient signal intensities that their sizes are correctly called, as illustrated in Figure 8 and the bottom panel of Figure 3, and described in more detail in the section Evaluation of Internal Lane Size Standards (ISS) and Allelic Ladders.

2.

Vortex loading cocktail (formamide/lane standard mix).

3.

Pipet 11 µL of loading cocktail into each well for human samples (10 µL of loading cocktail for mouse).

4.

Add 1 µL of amplified sample or negative control to the appropriate well. For each injection, add 1 µL of PowerPlex 18D Allelic Ladder Mix to a single well for human samples, or 1 µL of mouse allelic ladder for mouse samples.

5.

Cover wells with appropriate septa.

6.

Centrifuge plate(s) briefly to remove bubbles from the wells.

7.

Denature samples at 95°C for 3 minutes just prior to loading instrument.

8.

Immediately chill plate on crushed ice or an ice-water bath for 3 minutes.
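The loading cocktail arithmetic in step 1 can be sketched for both species as below. The per-well volumes are those given above (1 µL size standard + 10 µL formamide for human; 0.5 µL + 9.5 µL for mouse). The function name is hypothetical, and the well count should include ladder and control wells and any overage the laboratory normally adds.

```python
PER_WELL_UL = {
    # species: (internal size standard µL, Hi-Di formamide µL)
    "human": (1.0, 10.0),
    "mouse": (0.5, 9.5),
}


def loading_cocktail(n_wells, species="human"):
    """Total size standard and formamide volumes for n_wells."""
    iss, formamide = PER_WELL_UL[species]
    return {
        "size_standard_uL": iss * n_wells,
        "formamide_uL": formamide * n_wells,
        "dispense_per_well_uL": iss + formamide,
    }


print(loading_cocktail(10, "human"))
print(loading_cocktail(10, "mouse"))
```

The per-well dispense volumes computed this way (11 µL for human, 10 µL for mouse) agree with step 3.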

Figure 8.

Suitable Internal Size Standard (ISS). This example of an ISS shows peaks of DNA fragments 100 to 360 nucleotides in length from the Applied Biosystems GeneScan 600 LIZ size standard.

Instrument preparation and use

Detect amplified fragments using one of the capillary gel electrophoresis instruments specified above. Refer to the instrument User Guide for use and care of the instrument, and to the kit manufacturer's (e.g., Promega) technical manual for a detailed protocol of data analysis.

Step 8 – Allele Calling

To analyze human STR data produced with an STR kit, such as the PowerPlex 18D kit, the GeneMapper ID-X software version 1.2 or later can be used in combination with specific Panel, Bins, and Stutter text files, which can be obtained from www.promega.com/geneticidtools/panels_bins/. This allows the software to assign human genotypes automatically from the fragment electrophoretic data in the .fsa files. Alternatively, the STR alleles can be called using data from one of the Applied Biosystems capillary electrophoresis instruments (e.g., 3130, 3500, 3730, SeqStudio) or the Promega Spectrum Compact CE System genetic analyzers in combination with one of the following programs: GeneMapper version 5 for Windows 7 or version 6 for Windows 10 from Applied Biosystems, GeneMarker for Windows 7/8/10 from SoftGenetics, or OSIRIS from the National Center for Biotechnology Information at the NIH National Library of Medicine (NLM).

GeneMapper, GeneMapper ID-X, and GeneMarker have previously been used in the validation of mouse cell line authentication [3]. Bins and panels for the mouse multiplex assay can be accessed from Almeida et al. [3] in their supplemental section (S4 File), or can be downloaded in a zip file. Using the mouse allelic ladder, align the bins for the associated platform (specific array, polymer, and instrument) prior to data analysis.

Supplemental File 1 illustrates how to adjust allele calling software settings for optimal visualization of STR genotyping data using, for example, GeneMapper 6 Software for human STR kits.

Data Analysis and Interpretation

Interpretation of Human and Mouse Sample STR Genotyping Data

Researchers should compile from their in-house STR genotyping experiment, or receive from the STR testing facility, the following information: a table in CSV or XLSX format of the STR results for each sample, plus the results for the three controls that were run for quality purposes: 1) the allelic ladder, 2) the positive control DNA sample(s), and 3) the no-DNA negative control. In addition, they should print out or obtain PDF copies of the electropherograms of the analyzed data of the samples and all the controls so the data can be closely examined. The negative control will show whether the reaction mixture was contaminated with extraneous DNA and how well the analysis software corrects for background noise; the positive control and the allelic ladder (see Figure 8) will show whether the allele calling was correct. If an external facility was used, these controls should be evaluated by them prior to distributing the data to the researcher. Receiving all the data will allow the researcher to confirm that the results do not show any technical errors or artifacts, such as those mentioned above or below in the Data Analysis and Interpretation and Criteria for Determining Quality STR Profile Analysis for Reliable and Interpretable Results sections.

Because of the LOH seen in cultured cell lines, and unlike the reporting of forensic STR data, cell line STR profiles should report only the actual peaks seen; namely, single peaks in the STR profiles should be reported as single alleles (e.g., 9) and not double alleles (not 9,9 and not X,X for a female). The doubling of alleles, as done in forensic STR analysis, is based on an assumption that the samples are normal diploid cells. However, this doubling can bias (both in favor and against) matching cell line STR profiles and thus confuse the identification of both pure cell line cultures and of mixed cell line cultures. Counting only the alleles seen is standard practice, as demonstrated not only in the two versions of the ANSI-ATCC STR profiling standard [7, 49] but also in the procedure followed by others [8, 21, 33, 52, 58, 69, 85].
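When profiles arrive in the forensic style with doubled alleles, they can be collapsed to the alleles actually observed before any comparison is made. A minimal sketch follows, assuming that calls are stored as comma-separated strings; the function name and input format are illustrative.

```python
def observed_alleles(call):
    """Collapse a locus call to its distinct alleles, preserving order.

    '9,9' -> ['9'];  'X,X' -> ['X'];  '9.3, 10' -> ['9.3', '10']
    """
    seen = []
    for allele in call.split(","):
        allele = allele.strip()
        if allele and allele not in seen:
            seen.append(allele)
    return seen


print(observed_alleles("9,9"))
print(observed_alleles("9.3, 10"))
```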

On receipt of the data, the researcher must compare the results with STR profiles of the original sample or those expected for the tested sample. There are several algorithms for comparing two STR profiles. These are discussed in great detail in Chapter 6 and Appendix E of the revised ASN-0002 standard [49]. In 1999, Tanabe et al. [77] modified the Sørensen formula [75] to its presently used format as described by Capes-Davis et al. [21]. Originally, Sørensen divided the number of shared characteristics by the average number of characteristics for the two samples. In the Tanabe version of the formula, the number of shared alleles is instead multiplied by 2, and this product is divided by the total number of alleles in the two samples at shared loci. Loci that were not tested in both samples, or loci that gave no alleles ("no-calls"), are not included in the calculation of percent match.

In 2001, Masters et al. [58] proposed a simpler formula for calculating a % Match. In this formula, the number of shared alleles at shared loci (neither of which had any no-calls) is divided by the number of alleles in the query at shared loci:

% Match (Masters) = (number of shared alleles) / (total number of alleles in the query profile) × 100

Subsequently, Capes-Davis et al. evaluated the use of an alternative version of the Masters et al. algorithm. In this % Match algorithm the denominator is instead the number of alleles in the reference profile:

% Match (Alt – Masters) = (number of shared alleles) / (total number of alleles in the reference profile) × 100

Between 2001 and 2020, alternative algorithms were proposed, as described in Appendix E of ASN-0002 of 2021 [49]. In all of the above algorithms it should be understood that the profile with the fewest alleles limits the percent match. This becomes important where one sample has significantly more alleles than the other sample(s). Cases where this is more likely to occur are when: (a) a mixture of two cell lines is compared to a pure sample of a cell line, (b) one of the samples being compared has undergone extensive loss of heterozygosity (LOH) due to genetic drift, as commonly occurs in cancer cell lines, or (c) the cell line shows extra peaks due to being genetically unstable, for example, because it has a defect in the DNA mismatch repair system and manifests microsatellite instability (MSI; see Parson et al. [64]). To deal with these situations it is best to compare samples using all three algorithms. The Cellosaurus website provides these options in their CLASTR cell line comparison tool for human, mouse, and dog STR profiles.
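The three scoring schemes described above differ only in their denominators, which the following sketch makes explicit. It is an illustration written from the verbal definitions in this section, not the CLASTR implementation. The function name, the dictionary format, and the example profiles are assumptions, and amelogenin is left out of the score by default.

```python
def percent_match(query, reference, algorithm="tanabe", exclude=("Amelogenin",)):
    """Percent match between two STR profiles (dict: locus -> list of alleles).

    Only loci present in both profiles, with at least one allele in each,
    contribute to the score.
    """
    shared = n_query = n_ref = 0
    for locus in query.keys() & reference.keys():
        if locus in exclude:
            continue
        q, r = set(query[locus]), set(reference[locus])
        if not q or not r:  # a no-call in either profile: skip the locus
            continue
        shared += len(q & r)
        n_query += len(q)
        n_ref += len(r)
    if algorithm == "tanabe":
        numerator, denominator = 2 * shared, n_query + n_ref
    elif algorithm == "masters":
        numerator, denominator = shared, n_query
    elif algorithm == "alt_masters":
        numerator, denominator = shared, n_ref
    else:
        raise ValueError(f"unknown algorithm: {algorithm}")
    if denominator == 0:
        raise ValueError("no comparable loci")
    return 100.0 * numerator / denominator


# Hypothetical three-locus profiles, for illustration only.
query = {"TH01": ["6", "9.3"], "TPOX": ["8"], "vWA": ["16", "18"], "Amelogenin": ["X"]}
reference = {"TH01": ["6", "9.3"], "TPOX": ["8", "11"], "vWA": ["16", "17"], "Amelogenin": ["X"]}

for name in ("tanabe", "masters", "alt_masters"):
    print(name, round(percent_match(query, reference, name), 1))
```

In this toy example the query has lost one TPOX allele, so the Masters score (query denominator) is higher than the Alt – Masters score (reference denominator). This is the kind of asymmetry that makes it worthwhile to look at all three scores.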

Masters et al. [58], using their original algorithm, proposed that a score of 80% or greater would identify human samples with identical or closely related profiles. This threshold value has often been incorrectly interpreted to mean that two cell line samples with ≥80% match are the same or even identical. These comparisons only mean that the samples are most likely related because they were derived from the same donor.

Bady et al. [8] and another group [52, 85] recognized that this cut-off level was insufficient for identifying related human samples when using data from only eight STR loci. They recommended using results from a minimum of 15 STR loci when comparing samples, preferably with cut-off scores of >90% "to be absolutely certain of a match" as stated by Yu et al. [85]. The revised standard ASN-0002 of 2021 recommends that at least 13 human STR loci be used, with a match score of <70% indicating that two samples are very unlikely to be from the same donor and strongly suggesting that one of the samples is misidentified. Scores between 70% and 79% for cell lines known to be related could arise because significant genetic drift has occurred (e.g., due to MSI), or because one of the samples is a mixture of two or more cell lines. It may be possible to resolve such a mixture into its component cell lines by using the Alt – Masters % Match algorithm.

After the publication in 2011 of the first version of ANSI-ATCC ASN-0002 standard [7], a score of 80-89% for eight STR loci including the amelogenin locus in the scoring algorithm was used to indicate that two human samples were likely to be from the same donor and a score ≥ 90% indicated that the samples were most likely from the same donor. These scoring criteria have been revised in the 2021 revision of this standard for cell line authentication by STR profiling [49]. Currently, a minimum of thirteen core STR loci is needed to uniquely identify a human cell line. The thirteen loci are: CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, FGA, TH01, TPOX, vWA. The amelogenin locus is not to be included in the scoring of match, but only to identify potentially misidentified cell lines, for example, those that are purportedly from a female, but which bear an AMELY allele.

Cautionary Note

Using the match algorithms described above, a common guideline has been that if the human STR profiles have match scores of ≥80% at 13 or more STR loci (not counting the amelogenin loci on the X and Y human sex chromosomes), then the two samples are derived from the same donor. It is important to note that this does not mean that the two samples are identical. The 13 STR loci encompass only a very small portion of the human genome (< 0.0004%), and therefore many changes in the genome from the original sample can have occurred while the two samples still show 100% matching STR profiles. For instance, Kleensang et al. reported that two aliquots of the same batch of the breast cancer cell line MCF-7 (obtained from a commercial cell line repository), which differed by only 3 passages and which had identical STR profiles at 8 loci, showed considerable differences in phenotypes [45]. Ben-David et al. expanded this understanding of the genetic drifting process for the same cell line [13]. Therefore, it is important to understand that such scores only indicate that two samples are related by having been derived from the same donor; they are not necessarily identical and can have drifted significantly from one another genetically (and therefore phenotypically), altering their utility as cell line models.
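The score ranges discussed in this section can be collected into a small helper that also enforces the minimum number of loci. This is a sketch of the guideline as summarized here: the function name and message wording are assumptions, and the output is a prompt for further investigation rather than a verdict.

```python
def interpret_match(score_percent, n_loci, min_loci=13):
    """Map a % match score to the cautious interpretation given in the text."""
    if n_loci < min_loci:
        return "insufficient loci: compare at least 13 STR loci (amelogenin excluded)"
    if score_percent < 70:
        return "unlikely to be from the same donor; possible misidentification"
    if score_percent < 80:
        return "indeterminate: possible genetic drift or mixture; investigate further"
    return "consistent with the same donor (not proof that the cultures are identical)"


for score in (65.0, 75.0, 96.0):
    print(score, "->", interpret_match(score, n_loci=13))
```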

Using CLASTR of Cellosaurus to Find Related Cell Lines

The Cellosaurus database of cell line information for 139,592 cell lines includes (as of August 24, 2022) the STR profiles for 8,159 human cell lines, 78 for mouse cell lines, and 36 for dog cell lines. These profiles can be compared using the CLASTR search algorithm of Robin et al. [69].

Figure 9 shows how human STR data are entered into the CLASTR search template. The STR data can be entered manually, directly from the Cellosaurus database, or from a separate spreadsheet file. Three optional algorithms can be used with different filters for cutoff level, number of STR loci, and the number of returns. The results of searching for matches are shown in a table. Figure 10 shows the CLASTR search results for the HeLa-derived human cell line HEp-2 [HeLa] [48] (electropherogram data not shown).

Figure 9.

CLASTR Search Tool Input of Human STR Data. Data are for the HeLa-derived cell line HEp-2.

Figure 10.

CLASTR Search Results for Matches to STR Profile for the Human HeLa-derived HEp-2 Cell Line. Note that several matches have accession numbers in red font, which indicates these are misidentified cell lines. HEp-2 was one of many cell lines shown in 1968 …

Figure 11 shows the CLASTR search template for mouse STR data and Figure 12 shows the search results using the STR data of the mouse cell line P19 taken from the electropherogram in Figure 6.

Figure 11.

CLASTR Search Tool Input of Mouse STR Data. Data are for the mouse P19 cell line taken from Figure 6.

Figure 12.

CLASTR Search Results for STR Profile for the Mouse P19 Cell Line

Common Features of Cell Line STR Profiles

A unique cell line will have a unique STR profile that is different from that of an unrelated cell line as illustrated in Figure 13.

Figure 13.

Electropherogram of Two Unrelated Human Cell Lines. K562 (chronic myelogenous leukemia) and WS1 (skin fibroblast) were obtained from two unrelated individuals. The STR profiles differ between the two cell lines. STR analysis performed with PowerPlex …

Figure 14 shows how cell lines derived from different tissues from the same individual have the same STR profile.

Figure 14.

Electropherogram of Two Related Human Cell Lines. HAAE-2 (human aortic artery) and HFAE-2 (femoral artery) were obtained from the same individual. STR profiles are identical between the two cell lines. STR analysis was performed using the Promega PowerPlex …

However, the interpretation of STR profiles of human and mouse cell lines, especially those derived from tumor tissue, often presents certain nuances such as:

  • Loss of heterozygosity (LOH), also known as allele drop-out (ADO)

  • Peak imbalance

  • Novel out-of-range alleles

  • Multiple peaks at several loci

Loss of Heterozygosity (LOH)

Genetic instability in the STR markers used to evaluate human cancers is not uncommon [32, 58, 70]. Most human cell lines are derived from cancers, which differ genetically from normal tissue. Moreover, cell lines are capable of undergoing additional genetic changes while in culture. In a study of 24 lung samples, there were complete deletions of alleles at multiple loci when compared to normal tissue [65]. Figure 14 compares the allelic STR profiles of two samples from the same donor. Several loci show some degree of allelic imbalance. Figure 15 compares the STR profile of the DNA from serum of the donor with that of the cell line MDA-MB-435S (M14) from this donor's melanoma and illustrates the loss of alleles at three STR loci.

Figure 15.

Comparison of STR Profiles of Patient Serum DNA vs Misidentified Cell Line DNA Showing Allelic Imbalance and Losses. Comparison of partial STR profiles for DNA from serum from male melanoma patient M14 (top) and misidentified cell line MDA-MB-435S (bottom) …

Furthermore, great care should be taken when evaluating the STR profile of biopsy tissue (which may contain normal tissue) and comparing it to the STR profile(s) of cell line(s) derived from a tumor of that same tissue. Because of the presence of normal tissue, the STR profile of the biopsy may differ from that of the cell line that is eventually established from the tumor.

Peak Imbalance

Many tumor cell lines are aneuploid with varying copies of individual chromosomes and total chromosome numbers exceeding 46. For instance, M14, a melanoma cell line that is often misidentified as a breast cancer cell line, has been reported to have on average 66 chromosomes [46]. ML14 is a lymphoblastoid cell line that was established by transfection of lymphocytes with Epstein-Barr virus (EBV) from the same donor as M14. EBV is known to cause cells to fuse, which may result in fused cells being tetraploid initially. With culturing, the cells start to slowly lose some of the chromosomes. A sample of ML14 has a near tetraploid complement of chromosomes with an average of 89 chromosomes (Figure 16).

Figure 16.

Metaphase Karyogram of the Lymphoblastoid Cell Line ML14 [46]. Notice there are four copies of most of the human chromosomes, two versions of the X chromosome, and two marker chromosomes (M) of unknown origin. Image courtesy of Dr. Marileila Varella-Garcia. …

Aneuploidy is also common in mouse cell lines, where total chromosome numbers exceed 40. It has been shown in mouse embryonic stem cell lines [67, 76], and in a separate study Didion et al. demonstrated that 42% of normal and cancer mouse cell lines tested had evidence of aneuploidy [26]. The presence of chromosomal abnormalities in normal cell lines confirmed that this is not just a problem of cancer cell lines, but one that arises from cells growing in culture. It is possible that fluctuating aneuploidy can affect the apparent allelic balances in STR profiles.

Peak imbalance is a common feature of cancer cells (see Figures 5, 13, 14, and 16). The preferential amplification of one allele over another may be due to genetic duplications, aneuploidy, or chimeric (mixed) cell populations [25, 32, 65] in which some cells in the culture have lost a copy of an STR locus. This characteristic brings a unique feature to cell line identification that is not commonly seen in DNA from normal tissue (see Figure 3) or forensic samples. Cell lines often show allelic imbalances, as illustrated by the CSF1PO, D5S818, D16S539, and TPOX loci of K562 in Figure 13, the D13S317 locus in the HFAE-2 profile in Figure 14, and the TH01 locus in MDA-MB-435S in the bottom panel of Figure 16.
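As a rough illustration of how allelic imbalance might be quantified, the sketch below computes a peak-height ratio (smaller peak divided by larger peak) for the alleles at one locus. The function names, the example RFU values, and the 0.6 cut-off are hypothetical choices made for illustration; they are not taken from this chapter, and each laboratory would set its own criterion during validation.

```python
def peak_height_ratio(heights_rfu):
    """Return min/max peak height for the alleles at one locus (1.0 = perfectly balanced)."""
    if len(heights_rfu) < 2:
        raise ValueError("need at least two allele peaks to assess balance")
    return min(heights_rfu) / max(heights_rfu)


def is_imbalanced(heights_rfu, min_ratio=0.6):
    # min_ratio is an assumed, laboratory-defined cut-off (hypothetical default).
    return peak_height_ratio(heights_rfu) < min_ratio


# Hypothetical peak heights (RFU) for two heterozygous loci.
balanced_locus = [1800, 1650]
skewed_locus = [2400, 700]
print(round(peak_height_ratio(balanced_locus), 2), is_imbalanced(balanced_locus))
print(round(peak_height_ratio(skewed_locus), 2), is_imbalanced(skewed_locus))
```

A ratio well below 1.0 at a locus only flags the imbalance; it does not distinguish between duplication, aneuploidy, and a mixed cell population.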

Novel, Out-of-Range Alleles

On rarer occasions, some peaks do not align with the allelic ladder and fall between the size ranges of two loci (see the peak encircled in red in Figure 17). Judging from its position, this peak may be a 15-repeat allele of the D7S820 locus, and such an allele is listed on the NIST website. The identity of such out-of-range peaks should be confirmed by monoplex PCR with primers specific for each of the flanking loci (D7S820 and D16S539 in Figure 17) to determine which primer pair produces the out-of-range allele.

Figure 17. Examples of Allelic Imbalance and Novel Out-of-Range Allele. At the D5S818 and D16S539 loci (red arrows), an off-ladder allele between the D7S820 and the D16S539 loci, and stutter peaks. The red arrows point to stutter peaks in the D5S818 and D16S539 ...

Multiple Peaks at Several Loci

Another relatively common occurrence in STR typing of cancer cell lines is multiple peaks at several loci. Three or more peaks at one or two loci may be due to somatic mutation, trisomy, or gene duplications. The most common somatic mutations of STR loci are caused by microsatellite instability (MSI) due to defects in the cell's DNA mismatch repair (MMR) system. Profiles with more than three peaks at more than three loci may be due to cross-contamination with cells from a different human cell line (see Figure 18 for an example of cellular cross-contamination, and Appendix F and Figures H.5 and H.15 in the ASN-0002-2021 Standard [49]). In either case, an independent method should be used to rule out a mixture of cell lines.

Figure 18. Electropherogram of Cellular Cross-contamination. Multiple peaks at D5S818, D13S317, D7S820, D16S539, vWA, CSF1PO loci (indicated by the red arrows). STR typing of human cell lines with PowerPlex 1.2. Courtesy of Reid et al. [68].
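The discussion of multiple peaks above can be turned into a simple first-pass triage. The sketch below counts loci with three or more called alleles and reports which explanation deserves follow-up. The profile, the function names, and the cut-off of two loci are assumptions made for illustration; the output is a prompt for further testing, not a diagnosis.

```python
def loci_with_extra_alleles(profile, min_alleles=3):
    """Return the loci whose number of distinct called alleles is at least min_alleles."""
    return sorted(locus for locus, alleles in profile.items()
                  if len(set(alleles)) >= min_alleles)


def triage(profile, max_loci_without_mixture_check=2):
    # The cut-off of two loci is an assumed simplification of the text above.
    flagged = loci_with_extra_alleles(profile)
    if not flagged:
        return "no loci with three or more alleles"
    if len(flagged) <= max_loci_without_mixture_check:
        return "possible somatic mutation, trisomy, or duplication: " + ", ".join(flagged)
    return "possible mixture or MSI, confirm independently: " + ", ".join(flagged)


# Hypothetical profile (allele calls per locus), not data from any real cell line.
example = {
    "D5S818": ["10", "11", "12"],
    "D13S317": ["8", "11", "12"],
    "D7S820": ["9", "10", "11", "12"],
    "TH01": ["6", "9.3"],
    "TPOX": ["8"],
}
print(triage(example))
```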

Several approaches to discern whether a cell line is cross-contaminated with another cell line or whether MSI is the cause of a gain of alleles at one or more STR loci include:

  • Retrieve the earliest aliquot of the original batch of the cell line to see whether it also has multiple peaks. This would indicate whether the multiple peaks are a "feature" of the cell line or have arisen during subsequent culturing. Repeat culturing to see whether the additional alleles appear.

  • Attempt to subculture from a single cell, or from as few cells as possible, and see whether such subcultures present different profiles or a similar mixed profile. If the former, then the culture is probably a mixture of cells; if the latter, then the culture may be unstable due to a defective DNA MMR system.

  • If the additional shorter peaks differ from the taller peaks only by plus or minus single repeat units, this suggests that MSI is involved.

  • Using the Cellosaurus search tool CLASTR, search for potential matches to different cell lines using the Masters vs Reference algorithm. This approach can potentially identify the profiles of the components of a mixture of two cell lines (see this discussion of the algorithms), and thus identify the cell lines in the mixture and the source of the extra peaks. Figure 19 shows the CLASTR search results for the purported thyroid cell line BHP18-21 (CVCL_6282), which has three or more alleles at 7 of 9 STR loci (query line). This suggests that the cell line is actually a mixture of two or more lines. On comparing the STR profile using the Masters vs Reference algorithm, it is clear that BHP18-21 shares alleles with both the TPC-1 (CVCL_6298) thyroid cell line and the melanoma cell line M14 (CVCL_1395), two lines from different donors that have each cross-contaminated several other cell lines.

  • MSI causes gains or losses of single repeats in STR alleles, as described by Parson et al. [64] and illustrated in Appendix F of the ASN-0002-2021 Standard [49]. On comparing the STR profile with a reference profile, note whether the additional peaks differ by plus or minus single repeats (i.e., ± 4 bp) and, using the Masters vs Reference algorithm, determine whether the complex profile matches a single cell line or multiple different lines. If the profile shows only gains or losses of single repeats and the match is to a single cell line (including multiple variants of it), this suggests that the extra peaks are due to MSI. This can be confirmed by testing for MSI using either an in-lab designed assay (see supplemental information in Korch et al. [46]) or a commercial kit (e.g., Promega MSI Analysis System, Version 1.2, Cat. No. MD1641), or as described in Appendix F of the ASN-0002-2021 [49].
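The single-repeat criterion in the list above can be sketched as a small check. The code below lists the alleles that are present in a sample but absent from its reference profile and tests whether every one of them lies exactly one repeat away from a reference allele at the same locus. The function names and the profiles are hypothetical, and the sketch assumes whole-number alleles only; microvariants such as 9.3 would need extra handling.

```python
def extra_alleles(sample, reference):
    """Alleles called in the sample but absent from the reference, per locus."""
    return {locus: sorted(set(alleles) - set(reference.get(locus, [])))
            for locus, alleles in sample.items()}


def consistent_with_msi(sample, reference):
    """True if extra alleles exist and each is one repeat away from a reference allele."""
    found_any = False
    for locus, alleles in extra_alleles(sample, reference).items():
        for allele in alleles:
            found_any = True
            if not any(abs(allele - ref) == 1 for ref in reference.get(locus, [])):
                return False  # an extra allele that a single-repeat shift cannot explain
    return found_any


# Hypothetical whole-repeat profiles (integers = number of repeats).
reference = {"D5S818": [11, 12], "D13S317": [12], "vWA": [16, 18]}
shifted = {"D5S818": [10, 11, 12], "D13S317": [12, 13], "vWA": [16, 18]}
mixed = {"D5S818": [8, 11, 12], "D13S317": [9, 12], "vWA": [16, 18]}
print(consistent_with_msi(shifted, reference))  # extras are all +/- 1 repeat
print(consistent_with_msi(mixed, reference))    # extras are further away
```

A positive result is only suggestive; as noted above, MSI should be confirmed with a dedicated assay.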

Figure 19. Matches to Mixed "Thyroid" Cell Line BHP18-21 Using the Alt Masters % Match Algorithm. This is a selection of the CLASTR search results. Note that the CVCL IDs for true cell lines TPC-1 and M14 are in blue font, while all others are in red, which are ...
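To show why a match score computed against the reference profile can expose the components of a mixture, the sketch below scores a query as the percentage of reference alleles that are also found in the query. This follows the Masters vs Reference idea in simplified form: restricting the count to loci typed in both profiles and counting a homozygous allele once are assumptions of this sketch, and the CLASTR implementation offers further options. All profiles are invented.

```python
def shared_allele_count(query, reference):
    """Count alleles shared at loci typed in both profiles."""
    return sum(len(set(query[locus]) & set(reference[locus]))
               for locus in query.keys() & reference.keys())


def masters_vs_reference(query, reference):
    """Percent of the reference alleles (at loci typed in both) found in the query."""
    common = query.keys() & reference.keys()
    n_reference = sum(len(set(reference[locus])) for locus in common)
    if n_reference == 0:
        raise ValueError("no loci in common")
    return 100.0 * shared_allele_count(query, reference) / n_reference


# Hypothetical profiles: a mixture carries the alleles of both component lines.
line_a = {"D5S818": ["11", "12"], "TH01": ["6", "9.3"], "TPOX": ["8"]}
line_b = {"D5S818": ["10", "13"], "TH01": ["7"], "TPOX": ["8", "11"]}
unrelated = {"D5S818": ["9", "12"], "TH01": ["8"], "TPOX": ["9"]}
mixture = {locus: sorted(set(line_a[locus]) | set(line_b[locus])) for locus in line_a}

for name, ref in [("line_a", line_a), ("line_b", line_b), ("unrelated", unrelated)]:
    print(name, masters_vs_reference(mixture, ref))
```

Because the denominator is the number of reference alleles, each component of a mixture can score highly against the mixed query even though the query contains many additional alleles.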

Criteria for Determining Quality STR Profile Analysis for Reliable and Interpretable Results

Validation of Procedure

Method validation is the process of demonstrating that a laboratory procedure is robust, reliable, and reproducible in the hands of the laboratory personnel performing the test. The factors for method validation include precision, accuracy, limit of detection, specificity, linearity, range, robustness, and system suitability. For more information on method validation of STR systems, refer to the ANSI-ATCC ASN-0002 2021 revision [49], STRBase Validation Information to Aid Forensic DNA Laboratories, and Promega's Validation of STR Systems Reference Manuals.

Setting Appropriate Analytical Peak Heights and Peak Thresholds

Each laboratory should determine the analytical threshold or the level at which valid signals are above baseline noise. Data falling below this threshold may not be suitable for allele calls; however, data above the threshold should be of sufficient quality to determine an allele call.

The genetic material of some tumor cell lines is very complex, which can make it difficult to distinguish true low-level peaks from technical artifacts, including noise. There are no set rules for establishing threshold values; consequently, each laboratory must empirically establish peak-height thresholds for "calling" alleles as part of its validation process. Only peaks whose heights, expressed in relative fluorescence units (RFU), exceed the analytical threshold value can be accepted.

The threshold may be determined experimentally on the basis of observed signal-to-noise ratios, be a predetermined value established by manufacturers (for the ABI 3500 the recommended cut-off level is 150 RFU, while that for the ABI 3730 or 3130 instrument is generally about 50 RFU), or be set through in-house validation of the available kit and instrument. The lower threshold is a measure of the sensitivity of the procedure. The upper threshold is essential when reviewing data from samples with high DNA quantities. Samples with very high RFU values can saturate the instrument's detector and lead to artifacts that make the results difficult to interpret. One such artifact is incomplete adenylation: the terminal transferase activity of the DNA polymerase becomes less efficient, so that fragments both with and without a terminal A are produced (see Figure 24). Figure 20 shows, by contrast, a real 9.3-repeat allele at the D7S820 locus, which could be confused with incomplete adenylation of a 10-repeat allele. Additional artifacts include very intense peaks that appear as split doublets and signal spill-over into a different fluorophore channel. These artifactual peaks may be incorrectly interpreted as real peaks (i.e., alleles), especially when working with samples that are in the early stages of cellular cross-contamination.
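A minimal sketch of how the two thresholds might be applied is given below. It derives a candidate analytical threshold from baseline noise as the mean plus a multiple of the standard deviation, then sorts peaks into those below the threshold, callable peaks, and possibly saturated peaks. The multiplier of 10, the baseline readings, and the 8000 RFU upper limit are hypothetical placeholders; the 50 RFU lower limit is the example value quoted above for the ABI 3730 or 3130. Real values must come from the laboratory's own validation.

```python
import statistics


def noise_based_threshold(baseline_rfu, k=10):
    """Mean baseline signal plus k standard deviations (k is an assumed multiplier)."""
    return statistics.mean(baseline_rfu) + k * statistics.pstdev(baseline_rfu)


def classify_peak(height_rfu, analytical_threshold, saturation_threshold):
    """Sort one peak by height; only peaks that exceed the lower threshold are callable."""
    if height_rfu <= analytical_threshold:
        return "below analytical threshold"
    if height_rfu >= saturation_threshold:
        return "possibly saturated"
    return "callable"


# Hypothetical baseline readings (RFU) from a peak-free region of a trace.
baseline = [4, 6, 5, 7, 3, 5]
print(round(noise_based_threshold(baseline), 1))

# Hypothetical peak heights checked against a 50 RFU lower and 8000 RFU upper limit.
for height in (40, 1200, 9000):
    print(height, classify_peak(height, analytical_threshold=50, saturation_threshold=8000))
```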

Figure 20. Example of D7S820 Microvariant Allele. On comparison of an off-ladder allele with the allelic ladder (top panel) at the D7S820 locus of the cell line SKMEL28. Based on its length it is a 9.3 allele, one base shorter than the 10-repeat allele. It should ...

Another factor for optimal display of the STR allele peaks is the size range of analysis. In the GeneMapper software, this range must be set so that it excludes the residual peaks that appear at the beginning of each sample profile, and each panel must be set to automatically adjust the peaks to fit optimally. This procedure is described in the 2021 revision of the ANSI-ATCC ASN-0002 standard [49] and may be described in the manuals for the analysis software being used.

Use of Appropriate Positive and Negative Controls

Including controls in the STR profile analysis is very important because it allows the technician to identify and troubleshoot possible problems, such as DNA contamination of the PCR reagents, and thus to verify that the data are accurate and reliable.

During the amplification process, positive and negative controls are used. A positive control is a DNA sample with a known STR profile that is added to the sample set. The positive control confirms that the analysis processes are working accurately. A positive control DNA is usually provided in the manufacturer's STR kits. A negative control, which could be a reagent blank with either water or buffer substituted for DNA, is also included and treated in exactly the same manner as the other samples. This allows the technician to determine whether the reagents and/or techniques used may have introduced contaminating DNA.
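The control logic described above can be expressed as two simple acceptance checks, sketched below. The function names, the 50 RFU threshold, and the expected control profile are all hypothetical; in practice the expected profile would be the one published for the kit's control DNA and the threshold would be the laboratory's validated value.

```python
def negative_control_ok(peak_heights_rfu, analytical_threshold=50):
    """A reagent blank passes if no peak exceeds the analytical threshold."""
    return all(height <= analytical_threshold for height in peak_heights_rfu)


def positive_control_ok(observed, expected):
    """The positive control passes if every expected locus shows exactly the expected alleles."""
    return all(set(observed.get(locus, [])) == set(alleles)
               for locus, alleles in expected.items())


# Hypothetical expected profile for a control DNA (placeholder values only).
expected_control = {"D5S818": ["11"], "TH01": ["8", "9.3"], "TPOX": ["8"]}
observed_control = {"D5S818": ["11"], "TH01": ["8", "9.3"], "TPOX": ["8"]}

run_is_valid = (negative_control_ok([12, 20, 8])
                and positive_control_ok(observed_control, expected_control))
print(run_is_valid)
```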

Evaluation of Internal Lane Size Standards (ISS) and Allelic Ladders

All commercial STR kits include allelic ladders and internal size standards (ISS). At the completion of a run, each sample is assessed for correct calling of the ISS peaks. In general, the peaks from an ISS are uniform in height (intensity). In some ISS there may be peaks of different heights/intensities so that they can be easily identified. Lack of uniformity or miscalled peaks can indicate problems with the sample, injection, and/or run conditions. The ISS is valuable in determining the accuracy of a capillary electrophoresis run. Temperature variations during electrophoresis can cause the in-run precision to exceed 1 base pair, and evaluation of the ISS can assist analysts in identifying this issue.

In the new ASN-0002-2021 [49], Figures 5.5, H.8a, and H.8b show examples of good, poor, and unusable ISS profiles; the unusable ones cannot be used for profiling. Also, high amounts of sample can reduce the signal strength of the ISS, probably because the increased amounts of salt in the sample compete with the DNA amplicons for uptake (see Figures 5.5, 5.6, and 5.7 in ASN-0002-2021 [49]).

Allelic ladders should be assessed to ensure that all peaks have been correctly called (top panel in Figure 20). Ladder peaks that have not been called or have been miscalled can indicate a problem with the ladder sample, injection, and/or run.

Detection and Interpretation of Off-ladder Alleles and Microvariants

Allelic ladder (AL) standards are a collection of labeled DNA fragments that represent the most common alleles at each locus and were established through the evaluation of STR and sequencing data from several hundred individuals. Alleles within the STR loci are known to differ significantly between individuals, and the ALs included in STR kits do not represent all possible known alleles. The bins of the allele calling software identify known peaks that are in the range of alleles for a locus. Older versions of the allele bin data may not include more recently identified alleles. In this case, the software calls such unknown alleles within this range as off-ladder (OL) peaks / alleles. A well-known microvariant is the 13.3 allele of the D13S317 locus of HeLa cells (Figure 21). This allele, very infrequently seen in forensic samples and other cell lines, was often miscalled as a 14-repeat allele or an OL allele. This miscalling was probably due to the lower resolution of the slab-gel electrophoresis equipment used in the 1990s and the absence of a bin for this distinct allele. Peaks between loci ranges are usually not called (see Figure 17).

Figure 21. Example of D13S317 Microvariant Allele. The 13.3 off-ladder allele of D13S317 locus of HeLa cells, which has often mistakenly been called as a 14-repeat allele in earlier analyses.

While off-ladder alleles have been well documented in forensic STR testing, some may not have been previously characterized. The National Institute of Standards and Technology (NIST) website has a listing of human off-ladder alleles and can be used as a reference in these instances. If it is determined that an allele has not been characterized, it may be advisable to rerun the sample to confirm that it is not an artifact. If the result is repeatable, unlabeled primers for the possible loci (D7S820 and D16S539 for the allele between these two loci in Figure 17, or D13S317 for the 13.3 allele in Figure 21) can be used to PCR amplify and sequence the amplicons in question to ascertain the cause of the OL size.

Microvariants are alleles that contain an incomplete repeat at a particular locus and are expressed as a partial representation of the repeat motif. Generally, microvariants may be off by 1, 2, or 3 bases and are designated by the whole number of repeats plus .1, .2, or .3, respectively. Note that there are only these variations and there are no alleles with a ".4" designation, because most of the STR alleles in the various STR kits are tetranucleotide repeats and any allele with 4 extra nucleotides is the next full-size allele (i.e., a "14.4" allele migrates as a 15-repeat allele). There are exceptions to this, as shown by the sequencing of the different alleles. This is discussed extensively elsewhere [19, 49] and on the NIST website.
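The naming rule for microvariants can be written out as a short sketch. The first function builds the designation from the number of complete repeats and the leftover bases, rolling four extra bases of a tetranucleotide repeat into the next whole allele; the second estimates the designation from an observed fragment size relative to a ladder allele. The function names and the fragment sizes are hypothetical, and the sketch ignores the sequence-level exceptions mentioned above.

```python
def allele_designation(full_repeats, extra_bases, repeat_length=4):
    """Name an allele from its complete repeats plus leftover bases (e.g. 9 + 3 bases -> '9.3')."""
    full_repeats += extra_bases // repeat_length   # 4 extra bases become one more repeat
    extra_bases %= repeat_length
    return str(full_repeats) if extra_bases == 0 else f"{full_repeats}.{extra_bases}"


def designation_from_size(observed_bp, ladder_allele, ladder_bp, repeat_length=4):
    """Estimate a designation from the size offset to a known ladder allele."""
    offset_bases = round(observed_bp - ladder_bp)
    return allele_designation(ladder_allele, offset_bases, repeat_length)


print(allele_designation(9, 3))    # microvariant with three extra bases
print(allele_designation(14, 4))   # four extra bases is simply the next allele
# Hypothetical sizes: a peak about one base shorter than the 10-repeat ladder peak.
print(designation_from_size(observed_bp=230.1, ladder_allele=10, ladder_bp=231.0))
```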

Detection and Interpretation of Data Artifacts

Several artifacts can arise during STR profile analysis and lead to misinterpretation of the data. Identification and resolution of these artifacts are explained in great detail in the 2021 revision of the ANSI-ATCC ASN-0002 standard [49].

Stutter peaks (products)

Stutter products are a common amplification artifact of STR analysis caused by strand slippage of the polymerase [40, 84]. Stutter products are most often observed as peaks one repeat unit shorter (N-1) than the true allele peak. Other stutter products (N-2 or N+1 repeats), if seen, are the exception. Frequently, alleles with a greater number of repeat units will exhibit a higher percent stutter. The pattern and intensity of stutters may differ slightly between primer sets for the same loci (see Figure 22 and either Promega's manual on troubleshooting the PowerPlex 18D System or the manual from the manufacturer of the kit being used).

Figure 22. Stutter Peaks. Stutter peaks (red arrows) appear to run ahead of the main peaks (or true allele) by 1 repeat unit (4 bp). In cell lines, they can be mistaken for or co-migrate with real allele peaks. The 12-repeat allele in this figure is probably a stutter ...
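One way to screen for N-1 stutter is to compare each peak with the peak one repeat longer and flag it when its height is only a small fraction of that neighbor. The sketch below does this for whole-repeat alleles. The 15% ratio, the function name, and the peak heights are hypothetical; kit manuals and in-house validation supply locus-specific stutter percentages. Because a real allele can co-migrate with a stutter peak in cell lines, the function only flags candidates and does not remove them.

```python
def flag_stutter(peaks, max_stutter_ratio=0.15):
    """Return alleles that look like N-1 stutter of the allele one repeat longer.

    peaks maps allele (whole number of repeats) to peak height in RFU.
    max_stutter_ratio is an assumed cut-off, not a value from this chapter.
    """
    flagged = []
    for allele, height in peaks.items():
        parent_height = peaks.get(allele + 1)
        if parent_height is not None and height <= max_stutter_ratio * parent_height:
            flagged.append(allele)
    return sorted(flagged)


# Hypothetical locus: small peak at 11 next to true alleles 12 and 13.
locus_peaks = {11: 150, 12: 2000, 13: 1900}
print(flag_stutter(locus_peaks))
```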

Dye blobs

Dye blobs, which appear as broad peaks that are not as sharp as normal peaks, may occur in STR analysis (Figure 23). Evidence suggests that the fluorescent dye tags attached to the primers begin to break down over time. Dissociated primer dyes can appear in the sample analysis range and mask true data. Dye blobs are usually wider than real peaks and are typically seen in only one color. The dye blob in Figure 23 is probably evident because the signal strength of the peaks in the panel is relatively weak; with stronger peaks the dye blob might have been hidden in the background noise. Follow the manufacturer's specifications for storage of amplification kits to avoid problems associated with free dyes. If problems persist, clean up or re-amplify the sample.

Figure 23. Dye Blob. Note, it produces a broad peak. Courtesy of Reid et al. [68].

Incomplete adenylation of PCR amplicons

Non-template addition occurs when Taq DNA polymerase adds an additional base, usually adenine (A), to the 3' end of the amplicon during the PCR amplification process [18]. If adenylation is incomplete, split peaks that differ in length by one base are evident during CE analysis, so that an amplicon appears as +A and -A peaks (Figure 24). This type of artifact leads to broad or double peaks and makes it difficult for the software to accurately call the real allele. It is important to note that some authentic microvariant alleles at certain loci can migrate at the same position as these incomplete adenylation peaks and thus produce confusing results. They can look similar to the double peaks in Figure 20, but the double peaks due to incomplete terminal A addition will be evident at several loci and not just at one locus. The use of excess genomic DNA and/or too short a final extension time in the PCR can result in elevated levels of the N-1 base (-A) peak. In such cases it can be difficult to distinguish this peak from a true allele. Therefore, use of validated commercial kits approved for forensic analyses, accurate quantification of the DNA sample, and optimization of the final extension time will help ensure optimal STR peak formation.

Figure 24. Schematic of Incomplete Adenylation of PCR Amplicons. Schematic of non-template nucleotide addition shown (A) with illustrated size difference. (B) Some DNA polymerases add an extra nucleotide beyond the 3′-end of the target sequence extension ...
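The distinction drawn above, one-base doublets at several loci versus at a single locus, can be sketched as a small screen over fragment sizes. The function names, the 0.3-base sizing tolerance, the rule of two or more loci, and the fragment sizes are all assumptions made for illustration.

```python
def loci_with_one_base_doublets(sizes_by_locus, tolerance=0.3):
    """Return loci that contain two adjacent peaks about one base apart."""
    flagged = []
    for locus, sizes in sizes_by_locus.items():
        ordered = sorted(sizes)
        if any(abs((longer - shorter) - 1.0) <= tolerance
               for shorter, longer in zip(ordered, ordered[1:])):
            flagged.append(locus)
    return sorted(flagged)


def interpret_doublets(sizes_by_locus, min_loci=2):
    # min_loci is an assumed rule of thumb: doublets at several loci point to -A/+A peaks.
    flagged = loci_with_one_base_doublets(sizes_by_locus)
    if len(flagged) >= min_loci:
        return "incomplete adenylation likely"
    if flagged:
        return "possible microvariant at " + flagged[0]
    return "no one-base doublets"


# Hypothetical fragment sizes in bases.
many_doublets = {"D7S820": [230.1, 231.0], "TH01": [170.0, 171.1], "TPOX": [262.0]}
one_doublet = {"D7S820": [230.1, 231.0], "TH01": [170.0, 178.0], "TPOX": [262.0]}
print(interpret_doublets(many_doublets))
print(interpret_doublets(one_doublet))
```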

Dye pull-up or bleed-through

Dye pull-up, sometimes referred to as bleed-through, represents a failure of the analysis software to discriminate between the different dye colors used during the generation of the data. Oversaturated data due to excessive signal strength can cause the dyes to "bleed" over or pull up into another color space. If pull-up occurs, inject less of the sample, dilute the sample and reinject it, or re-amplify the sample with less DNA. Recurring pull-up (due to too much DNA) may indicate that the quantification method or the amount of DNA used for amplification should be reevaluated. If the peaks are very high due to too much DNA, three approaches can resolve the issue. The PCR reactions can be diluted either by adding additional volumes of formamide or by diluting the PCR reaction before mixing an aliquot with formamide. Alternatively, the injection time and/or voltage can be reduced and the samples re-run on the capillary electrophoresis instrument. The latter may require assay validation by the laboratory.

If DNA was accurately quantified and the concentration is in the target range it may be necessary to perform a new spectral calibration of the capillary electrophoresis instrument and re-run the samples.

In some cases, even after the ideal amount of DNA has been processed for STR genotyping, other peaks arise that are not artifacts but are actual alleles that fall outside of the bins (or known alleles) in the data analysis software. These peaks are called off-ladder alleles; see above for a discussion of this type of allele.

Services for STR Genotyping of Cell Lines

Over the past few years, several institutions have started offering STR typing of human cell lines. When choosing a testing laboratory, consider the experience of its personnel in performing STR analysis and interpreting the resulting data. The following are some types of institutions that currently offer STR typing services.

  • Cell Banks

  • Commercial genetic identification laboratories

  • Paternity testing labs

  • Universities

  • Core labs

Troubleshooting

The revised ANSI-ATCC ASN-0002-2021 standard for cell line authentication contains extensive troubleshooting guidelines [49]. Below are three sets of best practices and troubleshooting suggestions to assist in obtaining optimal STR genotyping results.

Best Practices for Avoiding Misidentified and Cross-contaminated Cell Lines

Below are some procedures that can help reduce usage of cross-contaminated and misidentified cell lines:

  • Maintain good documentation and tracking of cell line handling using in-house programs or cell line tracking and inventory software, such as Find Cell by Find Genomics.

  • Consult the Cellosaurus and the ICLAC websites for descriptions of misidentified and false cell lines to avoid using them.

  • Obtain cell lines from reputable cell line repositories that authenticate their cell lines. Beware of unreliable sources of cell lines that have simply cut and pasted cell line descriptions from the websites of the well-known repositories.

  • If generating cell lines etc. within your laboratory, freeze samples (biopsy, blood or serum, buccal smear, xenografts, FFPE) in liquid nitrogen for eventual STR genotyping and further genetic studies (e.g., next generation sequencing).

  • Quarantine cell lines upon receipt and test to ensure they are mycoplasma free. Perform the Cooper et al. PCR test for common mammalian species in sample [23].

  • Train all laboratory personnel who handle cell lines on proper cell handling techniques as described in these publications: [20, 22, 35, 38].

  • Write up standard cell line handling and testing protocols to be followed by all lab personnel, including how to interpret STR profiles and the significance of STR profile match scores.

  • Stress the use of good aseptic techniques.

  • Devote one reservoir of medium for each cell line.

  • Aliquot stock solutions/reagents.

  • Label flasks (name of cell line, passage number, date of transfer); use barcoded flasks when available.

  • Work with one cell line at a time in biological safety cabinet.

  • Handle the slower growing cultures before the faster growing cultures.

  • Clean biological safety cabinet between each cell line.

  • Allow a minimum of 5 minutes between each cell line with the laminar flow fan running.

  • Quarantine “dirty” cell lines (i.e., ones that have not been tested for the presence of mycoplasma and have not been authenticated for their identity) separately from “clean” cell lines before using them.

  • Maintain a manageable workload to reduce accidental mix-ups of cell lines.

  • Clean laboratory regularly to reduce bioburden.

  • Use legible handwriting or printed labels to avoid mislabeling of samples.

  • Routinely monitor for cell line identity and characteristics to check for potential contamination.

  • Use seed stock (create master stocks); see Figure 1 and Figure 7.

  • Create a clean, orderly working environment, which includes ensuring tissue culture hoods are neither cluttered nor used to store equipment and reagents.

  • Have the laboratory principal investigator review and approve laboratory notebooks frequently.

  • Authenticate cell lines regularly and a) after selection of cells (antibiotic selection or drug resistance), b) when unexpected phenotypes are observed, c) at end of a project, and d) before submission of grant applications and manuscripts. See 20 steps in the suggested Cell Line Policy available on the ICLAC website (see ICLAC links below).

Preventing Contamination During PCR

This section is for the laboratory performing the PCR and electrophoretic steps of the STR analysis. Preventing contamination during PCR is of critical importance to ensure that you are getting meaningful results. The following list provides some suggestions of how to reduce and/or prevent contamination during PCR.

  • Separate pre-amplification (low copy) space from post-amplification (high copy) space.

  • Use separate lab coats, gloves, tubes, and pipette tips in the pre-amplification room and the post-amplification room.

  • Use aerosol-resistant pipette tips.

  • Use a fresh pipette tip for every pipetting step, even when dispensing the same reagent or the same master mix into each tube.

  • Keep pre-amplification and post-amplification reagents in separate rooms.

  • Prepare amplification reactions in a room dedicated for reaction setup.

  • Use a separate aliquot of molecular grade water stock for each round of PCR*.

  • Prepare your PCR mix in a hood with laminar flow. Decontaminate it with 10% bleach, 70% ethanol, and/or solutions for decontaminating surfaces of RNases and DNases (e.g., RNaseZap or RNaseAway or NucleoClean decontamination spray).

  • Keep your tubes closed during the procedure, even your master mix tube**.

  • Be sure that your tubes are closed when discarding the pipette tip. Aerosols are dangerous!

  • Open the tubes only when necessary.

NOTES

* Molecular-grade and HPLC-grade water are usually DNA-free and do not carry the risk of residual DEPC, which could inhibit the PCR. Working in a PCR hood, aliquot these waters into sterile 15 and/or 50 mL polypropylene capped centrifuge tubes for ease of handling and minimization of contamination. Test aliquots for purity before use in PCR experiments.

** For the PCRs, individual PCR tubes with attached domed caps can be used instead of 96-well plates for setting up the PCRs. The caps can be left partially closed between each addition and then fully closed after the final addition. Although strip caps are available for 96-well plates, they are not easily handled in a manner that would avoid sample mix up and cross-contamination.

Troubleshooting of PCR Step

When first validating the PCR of STR alleles and on occasions when the PCR amplification of STR alleles may fail to produce any amplicons detectable by capillary electrophoresis, a useful troubleshooting technique is to analyze the yield of PCR reactions by agarose gel electrophoresis in the laboratory. This will be more rapid and less expensive than submitting samples to a facility for analysis.

Causes of unsuccessful PCRs, besides omission of a PCR reagent or an aliquot of sample DNA, include insufficient gDNA in the sample due to inaccurate DNA quantification and DNA that is not from the correct species for the STR genotyping kit.

To troubleshoot the multiplex STR PCRs, electrophorese an aliquot (1/3 or 1/2) of the reaction on a 1.75%-2.0% agarose gel made and run with 1X TBE buffer containing 0.3 µg of ethidium bromide per mL of buffer. The image in Figure 25 shows an agarose gel with the No DNA control reaction and the positive control DNA 9947a in the bottom half of the gel and the results of PCR amplification of STR alleles from DNA isolated from seven cell lines in the top half of the gel. Note that the No DNA control shows only the presence of the dye-labeled primers, whereas the other samples show multiple, closely migrating bands above the dye-labeled primers.

Figure 25. Agarose Gel of STR Profiling Reactions. Lane ML is the Bioline DNA 1 kb HyperLadder mass ladder in both the top and bottom set of lanes. Samples 1-7 are aliquots of the STR profiling reactions from 7 different cell line DNA samples. 9947a is the STR reaction ...
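The gel composition given above (1.75%-2.0% agarose, 0.3 µg ethidium bromide per mL) translates into amounts as sketched below. The function name, the 100 mL gel volume, and the 10 mg/mL ethidium bromide stock concentration are assumptions made for illustration; adjust them to the casting tray and stock solution actually used.

```python
def agarose_gel_recipe(gel_volume_ml, agarose_percent=2.0,
                       etbr_ug_per_ml=0.3, etbr_stock_mg_per_ml=10.0):
    """Return agarose mass and ethidium bromide amounts for one gel.

    etbr_stock_mg_per_ml is an assumed stock concentration (hypothetical default).
    """
    agarose_g = agarose_percent * gel_volume_ml / 100.0   # % w/v means g per 100 mL
    etbr_ug = etbr_ug_per_ml * gel_volume_ml
    etbr_stock_ul = etbr_ug / etbr_stock_mg_per_ml        # 1 mg/mL equals 1 µg/µL
    return {"agarose_g": agarose_g, "etbr_ug": etbr_ug, "etbr_stock_ul": etbr_stock_ul}


# Hypothetical 100 mL gel at the upper end of the suggested agarose range.
recipe = agarose_gel_recipe(100, agarose_percent=2.0)
for name, amount in recipe.items():
    print(name, round(amount, 2))
```

The same ethidium bromide concentration applies to the running buffer, so the function can be called again with the buffer volume to obtain the amount of stock to add there.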

Non-commercial Kits and Inappropriate Authentication Methods

Commercial vs Home-brew STR Genotyping Kits

Commercially available kits for human STR genotyping are recommended due to the extensive validation they have undergone on behalf of the forensics community. Some laboratories have developed their own STR assays [56]; however, they have not undergone as rigorous testing as the commercial kits and may not produce high quality results. Therefore, commercial kits are preferred and lists of different kits are available on the NIST website and in the 2021 ASN-0002 revised edition [49].

Inappropriate Authentication Methods of Cell Lines and Tissue Samples

Korch and colleagues [46, 47] and Appendix D in the 2021 revision of the ASN-0002 standard [49] discuss both earlier methods of authentication of cell lines and tissue samples and inappropriate methods that have been used too frequently in the literature. The two most common invalid methods are visual examination of cellular morphology and analysis of transcript or protein expression. Visually it is very difficult to distinguish between many cell lines, and all of these types of observations can be strongly affected by growth conditions.

Next-generation transcriptome comparisons [34] can be useful for monitoring differences between aliquots of samples of the same cell line to ascertain whether the data can be usefully compared and whether an unknown cell line has contaminated a culture. However, such sequence data are not useful for identifying cell lines using the Cellosaurus or similar databases, which use only STR data based on genomic DNA. Also, when a copy of a gene is not expressed, for any number of reasons, the % match from transcriptomic data will be lower than that obtained from STR genotyping of genomic DNA.

Glossary of Terms

Please refer to the “Genetics Terms” section of the “Glossary of Quantitative Biology Terms” for a list of terms and definitions used in this chapter and to the Glossary of terms in the 2021 revision of the ASN-0002 Standards [49].

License

All Assay Guidance Manual content, except where otherwise noted, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license (CC BY-NC-SA 3.0), which permits copying, distribution, transmission, and adaptation of the work, provided the original work and any copyrighted figures and data are properly cited and not used for commercial purposes. Any altered, transformed, or adapted form of the work may only be distributed under the same or similar license to this one.

Acknowledgements

We would like to thank Drs. Yvonne A Reid, Wilhelm G Dirks, and Amanda Capes-Davis for their suggestions and comments on an earlier draft of this AGM chapter and Drs. Reid, Douglas Storts, and their colleagues for permission to use some figures from the 2013 AGM chapter [68] for this 2023 AGM chapter describing the authentication of cell lines by STR genotyping. We are also grateful to Dr. Marileila Varella-Garcia for use of her karyotype of the lymphoblastoid cell line ML14.

Useful Resources

The International Cell Line Authentication Committee (ICLAC) maintains a register of misidentified cell lines. See Step 1 for additional information about this organization. Links to additional guidelines for handling and authenticating cell lines include:

See this list of commercially available STR Multiplex Kits and the 2021 Revision of the ANSI-ATCC ASN-0002, Appendices A and B [49].

The Find Cell program by Find Genomics allows maintaining a database of cell lines being used in a laboratory [86].

Links to Cell Line Repositories with Human STR Genotypes and Search Tools

This list is derived from Chapter 6 and Appendix I of the 2021 revision of the ANSI-ATCC ASN-0002 Standard [49].

Source: https://www.ncbi.nlm.nih.gov/books/NBK144066/