Introduction

Bone tumors are classified according to the cell type from which the tumor originates. Aside from the bone itself, a neoplastic growth may also originate from the surrounding soft tissues, muscles and ligaments. Benign bone tumors, which are more common than malignant tumors, include osteochondroma (derived from cartilage), giant cell tumor (GCT), osteoid osteoma and osteoblastoma. The most commonly diagnosed malignant bone tumors, often termed “sarcomas,” are osteosarcoma (derived from osteoblastic cells), Ewing sarcoma (derived from round cells of the bone marrow) and chondrosarcoma (derived from cartilage). Some tumors may also develop from soft tissues such as fat, muscle, nerves or blood vessels. Rhabdomyosarcoma, neurofibrosarcoma and angiosarcoma are a few examples of soft tissue sarcomas, whereas lipoblastoma and neurofibroma are benign soft tissue tumors.

Compared to most other tumors, a bone tumor can manifest pain early and is usually accompanied by local swelling, fever and spontaneous fracture. Plain radiographs can be used to detect a bone tumor at initial diagnosis and can also suggest the aggressiveness of the tumor. This step is usually followed by staging studies, which can be carried out using various methods, including bone scintigraphy, computed tomography, positron emission tomography or magnetic resonance imaging. However, a biopsy must be performed to confirm malignancy and to decide whether surgery is required (1). If it is, the surgeon may suggest different types of surgery depending on the size and location of the tumor, such as resection (removal of part or all of the affected bone), curettage (scraping out the tumor without removal of the bone; usually used for benign tumors) and limb salvage surgery (removal of the cancer while leaving part of the limb for an endoprosthesis). If the latter is not possible, amputation may be needed, which will affect the normal function of the limb and, subsequently, the quality of life of the patient.

Currently, there are no specific markers that can be used to diagnose tumors of the bone. Biomarkers for early identification of the disease are greatly needed to reduce mortality and increase the use of limb salvage strategies (2). Early detection of either recurrent or metastatic disease can also prompt early decisions and action to treat the tumor, which may improve patient prognosis (3). Robbins and Kumar (4) suggested that the elevated level of serum alkaline phosphatase released through osteoblastic activity in the tumor may be used as an indicator. However, the enzyme has also been reported to be elevated in patients with diseases of the liver, thus making it unsuitable for use as a specific bone tumor marker (5,6). Furthermore, this marker may be more useful for monitoring the progression of a bone tumor than for diagnosing it.

A biological marker (biomarker) is a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes or pharmacologic responses to a therapeutic intervention (7). It can be proteins, deoxyribonucleic acid (DNA), ribonucleic acid (RNA) or even metabolites. Biomarkers can be grouped into four main categories depending on their intended applications. First is a diagnostic biomarker, which can be used as a tool in identifying diseases or abnormal conditions in patients. On the other hand, a prognostic biomarker is an indicator that can predict tumor behavior and act as a marker of disease prognosis. The third type of biomarker is one that can be used in staging the disease, for example, in cancer. And finally, predictive biomarkers can be used to predict and monitor the clinical response after a treatment (7).

Biomarkers can also be classified according to their sources and functions, as explained by Baron in 2012 (8). The first group is carcinogenesis biomarkers, which are products of the neoplastic process directly produced by the tumor itself (for example, mutated or hypermethylated DNA). The second group is response biomarkers, which are generated when the body responds to the presence of cancer (for example, antibodies, protein degradation products and acute-phase reactants). The third group is released biomarkers, which include physiological molecules that are released in abnormal amounts after anatomical or metabolic disruptions of carcinogenesis (for example, blood in the stool or prostate-specific antigen in serum). Lastly, risk biomarkers, which consist of molecular markers associated with or supporting the carcinogenesis (for example, increased hormone levels).

Biomarkers should ideally possess certain characteristics to make them clinically valuable. These characteristics include being easily measured, reliable and detectable using a cost-effective assay without loss of analytical sensitivity or specificity (9). As of 2013, Fuzery and coworkers listed 23 protein cancer biomarkers that had been approved by the U.S. Food and Drug Administration (FDA), the majority of which are for breast cancer (10). The other biomarkers are for testicular, pancreatic, ovarian, colorectal, thyroid, prostate and bladder cancers. Although a lot of research has been carried out to identify biomarkers for bone tumors (Tables 1–4), none of the proposed candidates has so far been approved by the FDA for clinical use.

Table 1 Potential biomarkers for osteosarcoma.
Table 2 Potential biomarkers for Ewing sarcoma.
Table 3 Potential biomarkers for chondrosarcoma.
Table 4 Potential biomarkers for GCT.

Serum or plasma from blood, and urine, are the most frequently used samples in biomarker research because they are the easiest to collect from patients, and the method to obtain them is considered less invasive compared with tissue biopsy. These samples are well known to reflect various physiological and pathological states of the human body. Many biomarker studies have used tissue samples. However, these are not suitable for screening or early detection of diseases because the tissues usually come from patients who already have symptoms of the tumor (11).

Discovery Strategies

Diagnostic, prognostic and predictive biomarkers are most sought after in cancer research due to the urgent need in achieving a better clinical outcome for the patients (12). Presently, genomics and proteomics technologies are the two most widely used approaches in biomarker discovery strategies.

Genomics Technologies

The word “genomics” was first coined by Thomas Huston Roderick, a geneticist at The Jackson Laboratory, Bar Harbor, Maine, in 1986. The term can be defined as the study of an organism’s entire genome (13). Discovery of genomics biomarkers can usually be achieved using several approaches, including DNA sequencing, RNA expression and microRNA (miRNA) profiling, as well as epigenetic studies (14).

DNA sequencing is the most common approach used to identify genetic mutations in candidate genes, and it is also an important method in analyzing chromosomal rearrangement (that is, deletion, duplication, inversion and translocation of the chromosomes). The discovery of these genetic biomarkers usually starts with sequencing of the exome (that is, the coding region of the human genome formed by exons) and/or whole genome sequencing. This step is usually followed by the validation phase, typically by microfluidic Sanger sequencing technology. Today, Sanger sequencing technology has been supplanted by next-generation sequencing methods, which can be used for larger-scale, automated genome analyses and are also suitable for cross-platform validation at a much lower cost (15).

The discovery of epigenetic biomarkers stems from the ability of the DNA to undergo epigenetic modifications to the genome without any changes to the primary DNA sequence (16,17). Epigenetic alterations that are believed to be the causal events in cancers include DNA methylation, histone modification (methylation and acetylation), chromatin remodeling and regulation of noncoding RNAs (17,18). DNA methylation is the most extensively investigated as an epigenetic biomarker. In his review, Bock had outlined a systematic approach in the discovery of an epigenetic biomarker (18). A candidate differentially methylated region (DMR) is often mapped using bisulfite sequencing with the aid of computational tools, which is considered as a gold standard method in validating DNA methylation profiling (18,19). The candidate DMRs are then tested using medium-scale customizable methods such as microarrays or hybrid-selection sequencing. The selected top candidate DMR region-related genes are validated in large independent cohorts before entering the assay development and clinical trial phase.

In the case of expression biomarkers, which are derived from genes or RNA expression studies, genomics microarray is most frequently used because it is a powerful platform that can simultaneously measure the expression levels of thousands of genes. Coupled with hierarchical clustering algorithms, the data analysis that leads to classification of the gene expression can suggest candidate biomarkers that have diagnostic, prognostic or predictive values. To validate the microarray data, quantitative reverse-transcription polymerase chain reaction (qRT-PCR) is the most common method used and is considered the “gold standard” protocol in biomarker research (20).
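The clustering step described above can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering with a Pearson correlation distance, which is one common choice and not necessarily the one used in the cited studies. The sample names and log2 expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage_clusters(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(pair):
        # Average pairwise distance between members of the two clusters.
        c1, c2 = pair
        dists = [pearson_distance(profiles[s], profiles[t]) for s in c1 for t in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
    return [sorted(c) for c in clusters]


# Hypothetical log2 expression values for four genes in four samples.
profiles = {
    "tumor_1": [8.1, 2.0, 7.5, 1.2],
    "tumor_2": [7.9, 2.3, 7.8, 1.0],
    "normal_1": [2.2, 6.9, 1.8, 7.4],
    "normal_2": [2.0, 7.2, 2.1, 7.0],
}
groups = sorted(average_linkage_clusters(profiles, 2))
print(groups)
```

In this toy data set the two tumor profiles and the two normal profiles fall into separate clusters. Real microarray data would need normalization and many more genes before clustering.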

Proteomics Technologies

The term “proteome” was first coined by Marc Wilkins (21) and refers to the total protein complement of a genome. The term “proteomics” refers to the analysis of the protein complement of the genome (22). In cancer research, proteomics has been widely applied in profiling the expression of proteins in cancer patients using various types of samples, including blood (serum and plasma), urine, cerebrospinal fluid, tear, saliva and tissues.

A typical proteomics experiment usually involves separation and isolation of proteins from a sample, acquisition of their structures and their characterization, and finally utilization of annotated databases to identify the proteins. Compared to genomics approaches, proteomics offers wider avenues for research. This is because the genome of an organism is almost fixed, whereas the proteome changes with time and from cell to cell, besides frequently being subjected to posttranslational modifications.

The most common techniques used to separate and isolate proteins are one- and two-dimensional gel electrophoreses (1-DE and 2-DE). In 1-DE, proteins are separated based on their molecular mass, and it can be used to separate proteins with molecular mass of ∼10–250 kDa. However, it has a limited resolving power, especially for more complex mixtures such as neat serum and crude cell lysate. Instead, 2-DE can be used, since it can separate the proteins according to their net charge in the first dimension and to their molecular mass in the second. This provides a much better resolution for complex mixtures compared with 1-DE. The separation according to these two properties enables this method to also resolve proteins that have undergone posttranslational modifications (23,24). 2-DE has also been modified to generate its variant, two-dimensional difference gel electrophoresis (2D-DIGE), where different protein samples can be resolved simultaneously in a single gel. Each protein sample is labeled with a different fluorescent dye, each having its own excitation wavelength, and the gel is scanned at the corresponding emission wavelength to generate images of the individual samples. This method allows introduction of a labeled internal standard and therefore minimizes inter-gel variations. Nevertheless, there are some limitations to the electrophoresis strategy for protein separation. Besides being time-consuming and laborious, a 2-DE experiment usually cannot detect proteins of low abundance, especially if their molecular weights are low. High-abundance proteins such as albumin and immunoglobulins can mask the presence of these proteins. The depletion of these high-abundance proteins can increase sensitivity to the detection of lower-abundance proteins. However, depletion of albumin, for example, may cause the loss of several proteins and cytokines, including those that are currently used as biomarkers, because of their interaction with albumin itself (25–28).

The complexity of a sample such as serum also makes gel-based proteomics analysis a daunting task. To overcome these limitations, other proteomics alternatives can be used, such as surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) or a combination of liquid chromatography (LC) and MS. These techniques can analyze complex mixtures with a simpler workflow and enable a greater number of samples to be analyzed. In addition, enrichment steps can also be performed on samples to enable the study of a less complex subproteome. For example, lectins, which are structurally diverse carbohydrate-binding proteins of nonimmune origin with high affinities toward carbohydrate ligands, are suitable for enrichment of the glycoproteome fraction. The study of a patient’s glycoproteome fraction may enable further understanding of the dynamic changes in the glycoprotein profiles in cancer patients compared with healthy controls (29–32).

Profiles obtained from low-molecular-weight serum protein analysis using SELDI-TOF MS have been suggested to reflect the pathological state of organs and aid in the early detection of cancer (33–35). SELDI-TOF MS, first introduced in 1993 by Hutchens and Yip (36), is most valuable in profiling of low-molecular-weight peptides (<20 kDa), which cannot be achieved by LC-MS and 2-DE (37). This approach combines both retention and MS-based methods on a relatively simple principle. Because of the unique surface chemistries of the chips, the system can accommodate different types of samples such as serum, plasma, urine and cell lysates. This platform has been successfully used to profile serum proteins and has been used in the discovery of candidate biomarkers or proteomics patterns for lung cancer (38), renal cancer (39), endometrial cancer (40) and gastric cancer (41). However, it is incapable of directly carrying out the sequence-based identification of the discovered discriminatory peaks (42). Identification of the resolved peaks must be performed using another proteomics workflow, for example, purification of the proteins of interest through chromatography or gel electrophoresis, followed by MS analysis (39,40).

LC-MS involves physical separation and mass analysis of peptides, making it a very powerful analytical chemistry technique with high sensitivity and high specificity. The use of LC-MS allows for the “shotgun proteomics” approach, a term used when an entire protein sample is subjected to proteolytic digestion, yielding a highly complex mixture of peptides. The recovered peptides are subjected to the LC system, and in a typical reverse-phase single-dimensional LC, the separation of the peptides in an LC column is based on hydrophobicity. The resulting eluate is ionized before mass determination using MS. In the case of tandem MS (LC-MS/MS), the recovered peaks are further fragmented and analyzed by a second MS. LC-MS by itself is incapable of determining the amounts of proteins present in the sample. However, an approach using Isobaric Tags for Relative and Absolute Quantitation (iTRAQ) can be used to comparatively quantify the proteins. This chemical labeling method incorporates stable isotopes into an amine tagging reagent before identification of the proteins using MS (43). Currently, there are two sets of amine-reactive isobaric tags available on the market, the 4-plex and 8-plex, which allow the simultaneous labeling of four or up to eight samples, respectively. The tags are used to derivatize peptides at the N-terminus and the lysine side chains, therefore labeling all the peptides in a digest mixture (43). Upon fragmentation in the second MS (MS/MS), these tags give rise to unique reporter ions (m/z) that can be used to quantify the respective samples.
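As a rough illustration of how reporter ions translate into relative quantities, the sketch below sums reporter ion intensities per protein across its peptide spectra and expresses each channel relative to a reference channel. The 4-plex reporter ions are taken at nominal m/z 114 to 117. The protein names, intensities and the summed-intensity ratio scheme are assumptions for illustration, not a description of any vendor software.

```python
from collections import defaultdict

# iTRAQ 4-plex reporter ions appear at nominal m/z 114, 115, 116 and 117.
REPORTER_CHANNELS = (114, 115, 116, 117)


def protein_ratios(peptide_spectra, reference_channel=114):
    """Sum reporter intensities per protein, then divide by the reference channel."""
    totals = defaultdict(lambda: dict.fromkeys(REPORTER_CHANNELS, 0.0))
    for protein, reporters in peptide_spectra:
        for channel in REPORTER_CHANNELS:
            totals[protein][channel] += reporters[channel]
    return {
        protein: {ch: sums[ch] / sums[reference_channel] for ch in REPORTER_CHANNELS}
        for protein, sums in totals.items()
    }


# Hypothetical MS/MS spectra: (protein, reporter ion intensities per channel).
spectra = [
    ("protein_A", {114: 1000.0, 115: 2000.0, 116: 1000.0, 117: 500.0}),
    ("protein_A", {114: 3000.0, 115: 6000.0, 116: 3000.0, 117: 1500.0}),
    ("protein_B", {114: 800.0, 115: 800.0, 116: 400.0, 117: 1600.0}),
]
ratios = protein_ratios(spectra)
print(ratios["protein_A"])
```

Here the sample labeled with the 115 tag shows twice as much protein_A as the reference sample labeled with the 114 tag. In practice, isotope impurity correction and normalization across channels would be applied first.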

High-throughput relative protein quantification and identification can also be done using isotopic labeling of proteins or peptides and by means of label-free quantification of derived mass spectra. In the label-free LC-MS method, the amount of peptides can be determined using the ion signal intensities in the sample, where the data are collected in full MS scan mode to reconstruct the elution profile of the ions, thus producing the extracted ion chromatogram (XIC) (44,45). The abundance of the analyte can be determined relatively between two samples, which is calculated on the basis of the differences between the areas of the XICs of the two ions with the same mass (46). The labeling of these ions with isotopes can provide a more accurate result, but the cost involved is much higher than the label-free method, and complex sample processing for this method can lead to sample loss.
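The label-free comparison can be sketched as follows. The example integrates each XIC with the trapezoidal rule and reports the comparison as a log2 ratio of the two areas, which is an assumed (though common) way of expressing the difference. The retention times and intensities are hypothetical.

```python
from math import log2


def xic_area(times, intensities):
    """Integrate an extracted ion chromatogram with the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (intensities[i] + intensities[i - 1]) * (times[i] - times[i - 1])
    return area


def log2_fold_change(area_reference, area_sample):
    """Relative abundance of the same ion in two samples, as a log2 ratio."""
    return log2(area_sample / area_reference)


# Hypothetical elution profiles of one peptide ion (retention time in minutes).
times = [10.0, 10.5, 11.0, 11.5, 12.0]
control = [0.0, 200.0, 400.0, 200.0, 0.0]
patient = [0.0, 800.0, 1600.0, 800.0, 0.0]

control_area = xic_area(times, control)
patient_area = xic_area(times, patient)
print(control_area, patient_area, log2_fold_change(control_area, patient_area))
```

A log2 fold change of 2 corresponds to a fourfold higher peak area in the patient sample. Real workflows also align retention times and normalize total signal between runs.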

Once target molecules have been identified in the discovery phase, selective ion monitoring techniques can be applied for absolute quantitative measurements. Selection of appropriate and unique parent/product ion pairs for the analytes of interest using a powerful triple quadrupole mass spectrometer is known as single reaction monitoring (SRM) or multiple reaction monitoring (MRM), depending on the number of target ions screened (44,47). This nonscanning technique, unlike the shotgun approach, is highly selective and sensitive, which allows the researcher to direct the instrument to specifically monitor and perform absolute quantification of target peptides or proteins of interest, even of low abundance, in a complicated matrix (44,48). MRM, which offers a rapid and specific quantification assay without the use of any antibodies, has a huge potential to be used as a biomarker validation tool (48). It has been used to successfully quantify 45 proteins in human plasma, including 31 putative biomarkers for cardiovascular diseases (49). In a more recent study, Sung et al. (50) developed a high-throughput MRM assay to quantify and differentiate different isoforms of serum amyloid A (SAA), a putative protein biomarker for patients with lung cancer.

Biomarker for Bone Tumors from Genomics and Proteomics Studies

Today, a large number of candidate biomarkers for bone tumors have been proposed via various genomics and proteomics studies. Biomarkers for osteosarcoma, being the most common type of bone tumor, are the most extensively sought after by researchers. This is followed by biomarkers for Ewing sarcoma and chondrosarcoma.

Most of the genomics biomarkers proposed are of the prognostic and predictive types, and a number of studies have listed a series of multigene classifiers or signatures to correctly classify or identify various types of bone cancer, as opposed to single protein markers derived from proteomics approaches. Focus was also toward discovery of miRNA biomarkers. In this case, almost all analyses were on tissues, with the sole exception of a study by Yuan et al. (51), which had used serum samples in the search for a potential miRNA marker for human osteosarcoma. Compared to tissue studies, substantial amounts of sera were used to isolate sufficient miRNA for the study.

On the other hand, the majority of biomarkers uncovered by proteomics are of the diagnostic type. The most common proteomics platforms used in biomarker research of bone tumors were 2-DE and 2D-DIGE. Recently, gel-free platforms such as SELDI-TOF MS and LC-MS have also become popular among proteomics researchers, mainly because they require only small amounts of sample and allow quantitative analysis. Gel-free platforms offer additional options for automation and are known to be less laborious compared with 2-DE.

Potential Biomarkers for Osteosarcoma

Osteosarcoma, also called osteogenic sarcoma, is the most common type of primary bone cancer affecting children and adolescents (52). This tumor arises from mesenchymal cells and is characterized by osteoblastic differentiation of the neoplastic cells. The precise etiology of osteosarcoma is essentially unknown. Several reports, which were published as early as 1972, had already suggested that viruses such as human osteosarcoma virus (53) and Moloney murine sarcoma virus (54,55) can induce osteosarcoma. Other possible causes or initiating factors of osteosarcoma include chemical agents and radiation (56). In 2006, Bassin and co-workers reported an association between the incidence of osteosarcoma and fluoride exposure in drinking water during childhood (57). However, their finding is only consistent among male subjects.

A vast number of biomarker studies of osteosarcoma have proposed a series of miRNA fingerprints and multigene classifiers as signatures (Table 1) that may be used to reflect its pathogenesis and response to chemotherapy in patients (58–61), although single-gene miRNA biomarkers such as C7orf24, miR-21, miRNA-214 and a gene that codes for tenascin-C protein have also been suggested (51,62–99). Unlike proteins, genomics information carried by the DNA in an organism is stable over the entire lifetime to create and regulate proteins necessary for the cell structure and function. RNA is also stable in various types of samples; for example, miRNA is stable in formalin-fixed tissue and blood (100). However, the discovery of genomics biomarkers usually requires invasive procedures, since a fresh tissue specimen from a primary tumor is deemed the most suitable starting material. In this context, proteomics is a better choice for the discovery of biomarkers because it can use various types of bodily fluid samples, which can be obtained in less invasive ways.

A single gene can code for multiple proteins and hence increases their diversity. As most proteins undergo posttranslational modifications, these modifications also result in increased complexity of the proteome. The information of the physiologic changes mediated by these posttranslational modifications will not be available at the nucleic acid level, making proteins more dynamic and reflective of the cellular physiology compared with DNA or RNA (101). The structure and availability of the finalized protein decisively determine the cell behavior and, therefore, high-throughput screening for changes in protein expression is more suitable for discovery of prognostic or predictive biomarkers (102). However, due to the complexity of proteins and posttranslational processing, their analysis often proves to be a daunting task.

The technological advances in small molecule separation and identification, derived especially from proteomics, open the possibility for metabolites to be studied. Metabolomics, a more recent scientific field compared with proteomics and genomics, is defined as “the systematic study of small-molecule metabolites and their changes in biological samples due to physiological stimuli or genetic modification” (103). The most common methods used to study metabolites (both separation and identification) in biological samples are gas chromatography, capillary electrophoresis, high-performance liquid chromatography, ultra-performance liquid chromatography, MS and nuclear magnetic resonance (104). Because metabolites are related to functional phenotypes expressed in cells, tissues and organisms, metabolite biomarkers could be a good complement to genomics and proteomics biomarkers. Moreover, metabolomics possesses one critical advantage over the other “omics” technologies: each metabolite has the same basic chemical structure regardless of the type of organism, and findings are therefore readily transferable from one species to another (105,106). The other favorable features of metabolomics for biomarker discovery have been elaborated elsewhere (106). Recently, a group of researchers described a possible means to forecast the risk of breast cancer (up to 5 years) among women using plasma metabolomics and biocontour, with higher sensitivity and specificity than mammography (107). This metabolic forecasting of cancer may provide new insight into cancer etiology, besides being useful for early detection of the cancer (107).

Challenges in Biomarker Discovery and Implementation

Despite numerous reports of potential biomarkers uncovered from genomics and proteomics studies, the fact remains that these markers are still far from making their way closer to the patients. In the case of bone tumor, none of the potential biomarkers has currently been applied in clinical use. Genomics and proteomics technologies have produced close to 100,000 articles on biomarkers combined (PubMed search on April 15, 2015, keywords “proteomics biomarker” and “genomics biomarker”), but out of these, less than 100 managed to be validated for clinical use (108). In clinical practice, clinicians usually depend on numeric cut-off points when evaluating tumor markers. However, research in biomarkers usually proposed a set of genomics or proteomics signature, or fingerprint patterns, instead of a single biomarker, in distinguishing disease from normal conditions. The downside of this is that such patterns may at times be due to factors presented during collection of samples, such as the analysis of lipemic and hemolyzed samples, varying icteric index in the case of hyperbilirubinemia, freeze and thaw cycles, storage conditions, association with menstrual cycle, and diet and drug use, and not from presence of the cancer itself (109).
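To make the idea of a numeric cut-off point concrete, the following minimal sketch computes the sensitivity and specificity of a single marker at a chosen threshold. The marker values, the cut-off and the rule that values at or above the cut-off count as positive are all hypothetical.

```python
def sensitivity_specificity(patient_values, control_values, cutoff):
    """Classify values at or above the cutoff as positive and score the result."""
    true_positives = sum(v >= cutoff for v in patient_values)
    true_negatives = sum(v < cutoff for v in control_values)
    sensitivity = true_positives / len(patient_values)
    specificity = true_negatives / len(control_values)
    return sensitivity, specificity


# Hypothetical serum marker concentrations (arbitrary units).
patients = [5.1, 6.3, 4.8, 7.0, 3.9]
controls = [2.1, 3.0, 4.2, 2.7, 3.5]

sens, spec = sensitivity_specificity(patients, controls, cutoff=4.0)
print(sens, spec)
```

With these values, one patient falls below the cut-off and one control falls above it, so both sensitivity and specificity are 0.8. Moving the cut-off trades one measure against the other, which is why a multi-marker signature is harder to reduce to a single threshold.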

Variation of results is also known to exist from independent genomics investigations to find new prognostic gene signatures (110). The possible causes identified were poor study design, lack of a standard technology platform, nonstandardized sample collection procedures, difference in statistical method applied in each study and differences in input cohorts for the study (110). Such incoherence is also present in the field of proteomics. For example, in 2005, Baggerly et al. (111) raised controversy by questioning the reproducibility of a report by Petricoin et al. (34). In the study, the SELDI-TOF MS platform was used to correctly classify and discriminate all the ovarian cancer cases from nonmalignant disorders. Based on the data that were made publicly available by Petricoin’s group in a website in 2004, Baggerly et al. (112) reexamined the reproducibility of the work and concluded that “much of the structure uncovered in these experiments could be due to artefacts of sample processing, not to the underlying biology of cancer.”

The question of why protein biomarkers for cancer usually failed to reach the clinic has been attributed to three main reasons (113). First is fraudulent publications, which are actually quite rare (114). Second is the inability of the biomarkers to meet the demands by clinics due to low specificity, low sensitivity and low prognostic/predictive value, despite being successfully validated. To obtain approval from the FDA, clinical trials have to be conducted for the proposed biomarkers, but the expensive cost and various organizations that need to be involved sometimes impede decisions to further bring the potential biomarker into the clinics. Third is false discovery, where some biomarkers that initially look promising fail to make it through because of preanalytical, analytical, postanalytical and bioinformatics shortcomings at either discovery or the validation phase (113).

Several solutions have been proposed to address these shortcomings in an attempt to bring the biomarkers to become clinically useful. Diamandis (115) emphasized the need for biomarker scientists to possess the required analytical and clinical credentials other than experiences from the qualitative field of science to successfully convey the biomarker’s benefit to the patients. A standard stratagem that outlines the various phases of biomarker studies, from the discovery strategies through the validation phases toward the intended use of the discovered biomarkers, has to be implemented (116,117). Finally, a well-validated biomarker, which may not be useful enough for clinical practice when used solely, may be combined with other clinical or biomarker data to identify the clinical scenarios (113,118).

Conclusion

Numerous putative biomarkers have been proposed over the years for bone tumors, involving multiple research platforms. These biomarkers, mainly for osteosarcoma, Ewing sarcoma, chondrosarcoma and GCT, have prognostic, diagnostic and/or predictive values, and the majority were discovered by genomics and proteomics approaches. Despite this, the devastating present-day reality is that none of these tumor biomarkers has yet reached clinical use.

Disclosure

The authors declare that they have no competing interests as defined by Molecular Medicine, or other interests that might be perceived to influence the results and discussion reported in this paper.