The literature review is organized chronologically. The first section covers literature from 1960 to 1999, the second covers 2000 to 2015, and the last covers 2015 onwards. Papers were selected on the basis of their citation counts and publication in prestigious journals and conferences.
Early literature
One of the earliest research papers on Cobb angle calculation was published by Flint [13] in 1963. The author studied 31 female college students aged 19–22. The purpose of the study was to determine the correlation between the flexors and extensors of the hip and to measure the significance of back muscle abnormality for lumbar posture. Two images were taken of each subject standing in a relaxed position behind a 2 × 2 mesh screen. Following previous studies, a plumb line was dropped in both images to measure the line of gravity, and a radiologist helped to locate the focal point of the trochanter. Four landmark points were placed at the spine, over the sacral and lumbar junction, over the upper surface, and over the dorsal surface of the sacrum. The first line was drawn from the dorsal surface to the apex of the curve so that it intersected the line drawn upwards from the sacrum pointers; the STL2 angle at this intersection was the first reading. The second reading was the angle at the intersection of the lines parallel to the pointers on the inner lumbar surface and the outer surface of the sacrum. This L angle was used to measure the lordotic curve: a larger angle indicates a smaller curve and vice versa. The PI2 angle of pelvic inclination was the angle between the horizontal plane of the pelvis and the horizontal line through the anterior and posterior superior iliac spines. For performance measurement, the mean and standard deviation of each set and the coefficients of correlation between the curves were calculated. The STL2 values were not significantly correlated, and L2 suggests that hip flexibility does not affect the lumbar curve. The author drew three basic conclusions: first, external measurement of the lumbar region provides an accurate measurement of lumbar convexity; second, there is no significant relationship between lumbar lordosis and hip-trunk flexibility; third, pelvic inclination has no impact on the position of the center of gravity.
In 1967, Loebl [14] published a paper on measuring normal spine posture using rheumatology images. The dataset comprised 176 spine images of subjects aged 15–84 years from Westminster Hospital and Queen Mary’s Hospital, Roehampton. Subjects were imaged sitting, standing, and bent. An inclinometer with a 9-cm gap and a scale in degrees was used; its weighted needle remained vertical and thereby indicated the angle of spinal inclination. The spine was marked downwards at intervals from D1 to D12. Excluding the neck, the spine was split into four divisions: upper dorsal, lower dorsal, dorso-lumbar junction, and lumbar. Readings from nine normal subjects were taken five times each, and an average variation of 14° was observed, which was attributed to inaccurate manual readings. Below age 40, the dorsal spine of females was 4–5° straighter than that of males, but above age 40 the curvature of both genders was almost the same. The lumbar curvature varied from 25 to 55° ± 7° in old age. The author emphasized the importance of curve variations; for example, deep lordosis is due to hip deformity. Measurements of 38 rheumatoid arthritis patients showed no difference from the normal ranges, and the results for 15 ankylosing spondylitis patients were the same. Only one patient, whose sacro-iliac joint showed a 12° difference, was confirmed to have an affected dorsal spine. Improvements were observed after exercise and phenylbutazone. The author concluded that most subjects had angles within 10% of the normal ranges.
Mehta [12] discussed scoliosis using radiographs. The convex side of the vertebrae was rotated clockwise through an arc of 90° in intervals of 15°, producing a series of images with a remarkable variation in appearance. Intervals of less than 15° failed to produce a clear difference. Multiple landmarks were drawn on composite images to obtain the rotation difference. The appearance of the transverse process differed between the ranges of 15–30° and 45–60°, which was helpful, but it was almost the same from 75 to 90°. The study was also carried out on scoliosis images to determine the extent of curvature change, and an image matching method was proposed for estimating the disease. The methodology provides approximate results but has the advantage of monitoring rotation up to 90°.
Levine and Leemet [15] proposed a two-stage approximation approach to edge detection. To determine where the spine is located, the vertical signature of the entire image is taken and smoothed to reduce noise. For edge detection, the first derivative is computed along the horizontal direction in the first stage, and a least-squares polynomial fit is applied in the second stage. A mean predictive weighting function restricts the local search for edges and improves the process. The order of the polynomial fit was increased iteratively from third to fifth for better results. Lastly, the center line was calculated as the median of both edges on the horizontal scale.

Chwialkowski et al. [16] presented a quantitative measuring strategy for lumbar disc evaluation. The proposed method used 12–15 sagittal images; an edge-enhanced rectangular block of the size of the spine was then specified as the region of interest (ROI). A morphological model of the vertebral structure was designed, and vertebra candidates were localized by fitting blocks. The center-most cluster of each fitted candidate was considered the estimated intervertebral disc. The intensity of each estimated disc space was profiled along the bisection line. The intensity curve was compared against the normal belly shape, and an excessive difference was used to eliminate abnormal discs. The average gray level of the 5 × 5 neighborhood was used to estimate the disc width. Tagged findings were then checked for inconsistent trends across the image sequence to confirm abnormal discs.
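As a rough illustration of the localization and center-line steps attributed to Levine and Leemet above, the following Python sketch assumes that the "vertical signature" is the column-wise sum of pixel intensities and that the "median of both edges" is the midpoint between the left and right edge in each row. Both readings are assumptions, all function names are hypothetical, and the derivative and polynomial-fit stages are omitted.

```python
def vertical_signature(image):
    """Column-wise intensity sums (assumed meaning of 'vertical signature')."""
    return [sum(column) for column in zip(*image)]


def moving_average(values, window=3):
    """Simple smoothing; the window shrinks at the borders."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def locate_spine_column(image, window=3):
    """Index of the column with the strongest smoothed signature."""
    smoothed = moving_average(vertical_signature(image), window)
    return max(range(len(smoothed)), key=lambda i: smoothed[i])


def centre_line(left_edge, right_edge):
    """Row-wise midpoint between the two detected edges (assumed reading)."""
    return [(l + r) / 2 for l, r in zip(left_edge, right_edge)]


# Toy 3 x 5 image whose bright band (the "spine") sits in column 2.
image = [
    [0, 2, 9, 3, 0],
    [0, 1, 8, 2, 0],
    [0, 2, 9, 3, 0],
]
spine_column = locate_spine_column(image)
line = centre_line([2, 3, 4], [6, 7, 10])
```

In practice the edge positions passed to `centre_line` would come from the fitted polynomials rather than from hand-written lists.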
In [17], Smith et al. described the use of active shape models (ASM) to locate vertebrae in DXA images. These low-spatial-resolution images contain noise, which makes object localization challenging. All vertebrae in the image, from thoracic T7–T12 and lumbar L1–L4, were marked manually with six points each. Point distribution models (PDM) were applied together with PCA to simplify the covariance matrix. ASM uses both shape and gray-level appearance to detect objects in an image. The results were characterized as a Gaussian distribution with the help of the EM algorithm, and a case was rejected if its error lay above μ + 3σ of the successful cases.
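The μ + 3σ rejection rule mentioned above can be sketched in a few lines of Python. This is a simplified illustration: it estimates μ and σ directly from the errors of the successful cases with the sample mean and standard deviation, whereas the study fitted the distribution with the EM algorithm. The function names and the error values are hypothetical.

```python
from statistics import mean, stdev


def rejection_threshold(success_errors):
    """Return mu + 3*sigma of the errors of the successful cases."""
    return mean(success_errors) + 3 * stdev(success_errors)


def split_cases(errors, threshold):
    """Separate cases at or below the threshold from those above it."""
    accepted = [e for e in errors if e <= threshold]
    rejected = [e for e in errors if e > threshold]
    return accepted, rejected


# Hypothetical localization errors of successful cases.
success_errors = [1.0, 1.2, 0.8, 1.1, 0.9]
threshold = rejection_threshold(success_errors)
accepted, rejected = split_cases([1.3, 2.0, 0.7], threshold)
```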
The domain of image processing emerged in the late 1960s. That early era focused mostly on the enhancement and restoration of images. In the medical domain, image acquisition and dataset collection were the major issues for researchers. Research from this era points to problems such as low-quality images and the limited availability of imaging modalities. The immaturity of the domain was the key factor, and the limited number of techniques available for segmentation and classification was a further drawback. Roentgenographic [18] and X-ray images were mainly used for differential diagnosis. Morphological processing was popular for extracting components from images. A secondary issue in that era was hardware limitations, such as storage space, memory, and low processor clock speeds, which left little overall capacity for complex processing.
Mid-era literature
Brejl and Sonka [19] proposed a new method based on model-based image segmentation. The authors used 2D MRI of different dimensions of the thorax, brain, and spine. The training dataset contained manual boundary tracings in combination with landmark identification. For contour alignment of the training data, the mean shape and its variation are stored in an R-table. A separate border appearance model was designed for the training dataset. A shape-variant Hough transform and edge-based object segmentation were used for segmentation.
In [20], the development of a segmentation technique for locating and orienting the cervical vertebrae is reported, using 50 NHANES II X-ray images. The landmarks in every image are based on morphometric points identified with the assistance of an expert radiologist. The technique is claimed to be invariant to noise, rotation, and scale. The generalized Hough transform (GHT) is customized to provide shape information using the mean shape and corresponding templates, and its built-in accumulator structure helps to deal with noise. The technique provides the orientation, location, and contour of the cervical vertebrae. On average, 72.06 out of 80 LMP fell within the bounding box, and the average orientation error was 4.16°.
Gamio et al. [21] proposed a novel approach for vertebral segmentation from MRI scans. The dataset consists of 6 subjects. Initially, coil correction is applied, followed by interpolation with the mean values of adjacent slices to generate a 3D stack. Normalization followed by cropping of the region of interest reduces the data size and the computation. An anisotropic diffusion algorithm is used to reduce brightness and preserve edges. Normalized intensity based on a 3D local histogram provides brightness features, and a histogram of textons provides position and intensity features for segmentation. The spine bone is localized with the normalized cut technique, after which the Nyström approximation method is applied for vertebral body segmentation. Results are reported for different values of nk, the number of histogram bins, and of rp and rz, the size of the local volume used for the histograms.
Peng et al. [22] discussed an algorithm for vertebra segmentation in MRI scans. The algorithm has two stages: intervertebral disc localization and vertebra segmentation. In the first stage, a model-based search method provides clues about the intervertebral disc spaces adjacent to each vertebra. Intensity profiling with a polynomial function is used to refine and verify the candidate disc spaces. Extended profiling in the horizontal direction from the center point of the disc provides a shape approximation. A Canny algorithm is then used to extract the vertebra boundary. The disc spaces are recalculated from the boundary values, and repeated polynomial profiling identifies the intervertebral disc distances. The data consisted of MRI scans of seven subjects at a resolution of 412 × 1012. For 22 vertebrae across 5 datasets, the average rate of successful boundary extraction was approximately 94%.
Lin [23] formulated a 3D spine model from the coronal and sagittal planes. Seventeen uniform segments and 18 nodes, fitted with 3D Bezier curves, were superimposed on X-ray scans. Plotting from top to bottom produced an axial view of the space curve. The features selected by the author were curvature and torsion. To evaluate spinal deformity, King’s classification was applied along with a multilayer feed-forward, back-propagation (MLFF/BP) artificial neural network (ANN). The dataset of 37 X-rays was divided into 25 training and 12 testing images, with the training images split into subgroups for the 5 King types [24]. With two hidden layers, the identification rate was rn = 0.68 at 300 iterations and rose to rn = 0.84 at 900–1200 iterations; with one hidden layer, it was rn = 0.72 at 300 and 1800 iterations and rn = 0.76 at 1200 iterations.
Xiaoqian et al. [25] used 801 cervical and 972 lumbar vertebra shapes from NHANES II X-ray images. The researchers studied the nine-point landmark model that clinical experts use to describe vertebra shapes. Xu et al. established an automatic system that detects the nine points of this model on the basis of semantic heuristics. The corner information provided by this system is the initial input to the proposed partial shape matching (PSM), which uses dynamic programming (DP). Curve evolution is applied to reduce the number of data points to 20 vertices. Negative bend angles, which point in the clockwise direction, are eliminated; a positive bend angle may be a corner of a vertebra and creates a line segment between two points on the contour. Using DP, the triangle data of each point set are stored, and matching is carried out against the dataset for every classification. In total, a dataset of 801 cervical and 972 lumbar segmented shapes was formed from 400 X-ray images.
Benjelloun and Mahmoudi [26] performed a comprehensive analysis of the extraction of the anterior left faces of the vertebra contour. They focused on extracting the angular variation along the spinal column and on computing angles for the global curvature. They relied on the Harris interest point detector for key points. For mobility analysis, the left boundary is extracted in a supervised manner from a single click on a vertebra corner. A search zone is placed around the clicked point, and the direction of the tangent is determined. The distance between two corners inside the circle is measured, supported by pre-processing; contrast enhancement gives better results in corner recognition. A dataset of 100 images from the National Library of Medicine was used, and the method was reported to provide stability, speed, and satisfactory results.
Tobias et al. [27] described a framework for vertebra detection and segmentation in CT images. The methodology starts with curve extraction using a sophisticated generalized Hough transform that is parameterized for multiple shapes and a number of objects. A vertebral coordinate system describes the location and coordinates of each vertebra in combination with a progressive seed-point adaptation method. Vertebrae are identified from the average intensity information inside each vertebra bounding box, and each individual vertebra is segmented by adapting triangulated shape models.
In [28], Ribeiro et al. used 40 cases, of which 19 were confirmed fractured bones and 22 were normal spinal images. A Gabor filter bank with 180 filters was applied to these gray-scale images at angles θ = - π/2. The resulting data provide fine orientation details owing to the high response over edges and corners. The response is then weighted by adding the sine and cosine of twice the significant angle. Points were marked with a mouse at the center of each vertebra for distance calculation and region splitting. A neural network with a logistic sigmoid function is used for more detailed analysis, and morphological opening and closing are carried out to handle holes, noise, and region filling. The accuracy of the proposed system is quoted as 91–92%.
Anitha and Prabhu [29] proposed a methodology for the automatic quantification of spinal curvature using 250 radiographs, which they grouped by the degree of the Cobb angle. The input image is first enhanced; the Snake-I methodology then calculates the initial boundary, and a gradient vector field Snake, which is less sensitive to noise, provides better results. The extracted boundary is enhanced and retained using morphological operations with a structuring element. The Hough transform is used to calculate the slope of the horizontal lines of the boundary. This gives exact information about the vertebrae and eliminates inter- and intra-observer error.
In [30], Larhmam et al. claimed 89% accuracy for their approach, which they validated on 200 images of 40 healthy cases. They used a modified template matching methodology based on the Hough transform, which is invariant to translation, rotation, and scale. For segmentation, the first step was the construction of a model from the average geometry of 25 vertebrae. In the second step, the Canny detector was applied, using Gaussian smoothing in combination with the Sobel operator and non-maxima suppression. The Hough transform was the third step of the segmentation process and reduced false-positive edges. Potential vertebra centers are identified using contrast limited adaptive histogram equalization (CLAHE) together with Canny and Sobel edge detection. Finally, linear regression on the Hough R-table selects the highest-voted point.
Sardjono et al. [31] addressed Cobb angle determination using 36 X-ray images. The authors built on the CPM technique of Jalba et al. from 2004 for vertebra edge detection. The charged particle method (CPM) performs segmentation with charged particles that are attracted to the contour of the object, whose gradient magnitude helps to locate the object boundary. For an S curve, three parts of the spine in the vertical direction determine two Cobb angles, whereas for a C curve, two parts of the spine determine a single Cobb angle. The Cobb angle was determined from the slope of the curve by piecewise linear curve fitting with splines, steps, and polynomial functions. On the 36 X-ray images, R2 was measured with different segments and steps, giving satisfactory results.
Rasoulian et al. [32] suggested a novel shape and pose segmentation technique. The dataset consists of 32 CT images acquired from Kingston General Hospital and Vancouver General Hospital. The boundary was established with a registration technique based on a group-wise Gaussian mixture model (GMM). Principal component analysis (PCA) was used to select the parameters of the statistical shape model. An expectation maximization registration algorithm was used for segmentation. Sorting the eigenvalues and weights of the PGs improved the registration speed. To smooth the CT scans, a multivariate Gaussian kernel was used in combination with a Canny edge detector to produce the boundary. The mean point-to-surface distance error was 1.38 ± 0.56.
The first decade of the new millennium was a productive period for digital image processing (DIP), as the domain gained both popularity and maturity. The growing demand for medical imaging, improved image quality, and multiple acquisition and enhancement methods opened the door to reliable diagnosis. Nevertheless, noise, low contrast, and brightness issues in medical images were the main problems highlighted by researchers. Better preprocessing methods were therefore introduced, and many new filters and noise removal techniques were applied to improve results. The evolution of the hardware industry also provided better processing and storage solutions. Novel segmentation and classification techniques were introduced and applied to medical imaging, which improved the results by leaps and bounds.
Recent literature
In [33], Korez et al. developed an automated supervised segmentation method for vertebral bodies (VBs) based on a 3D convolutional neural network (CNN). The dataset consisted of MRI scans of 23 subjects, and a 3D mesh of a mean shape model of the vertebral body was formed. The CNN then provided a generalized probability map of the VB. Figure 4 shows their entire model. The main novelty of the proposed technique is the 3D spatial VB probability maps. In the evaluation, the method achieved a Dice similarity coefficient of 93.4 ± 1.7%.
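Because the Dice similarity coefficient is the evaluation metric quoted by most of the studies in this section, a minimal Python sketch of its standard definition, 2|A ∩ B| / (|A| + |B|), may be helpful. The masks below are toy 2D examples, the function name is hypothetical, and returning 1.0 for two empty masks is a convention assumed here rather than something taken from the reviewed papers.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity of two equally sized binary masks (nested lists of 0/1)."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same size")
    overlap = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / total


# Toy predicted and reference segmentations.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)
```

Here both masks contain three foreground pixels and share two of them, so the score is 4/6. The same function applies to 3D volumes once they are flattened slice by slice.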
To reduce misdiagnosis by CAD systems, Arif et al. [34] proposed a fully automated cervical segmentation framework using a deep fully convolutional neural network. The vertebra centers are localized by probabilistic spatial regression. For segmentation, a dataset from Royal Devon and Exeter Hospital was used, containing 124 X-ray images for training and 172 for testing. A shape-aware deep network was formulated that requires no manual input. The method achieved a Dice similarity coefficient of 0.84 and a shape error of 1.69 mm.
In [35], Shi et al. developed a two-level methodology for vertebral localization and segmentation that combines the intensity pattern with a GPU-accelerated CNN. In the first step, the spinal region is extracted using 2D U-Net variants. Next, the centroid of each vertebra is localized by applying the M-methodology, which produces a 3D ROI. Finally, a 3D U-Net with inception was used, trained on 61 annotated CT images. The method achieved a correct identification rate of 92% and an error of 0.74 mm, with a Dice coefficient of 0.8.

In [36], Lu et al. described a fully automated deep-learning approach for lumbar spinal stenosis grading. The research makes three major contributions: first, an NLP scheme that extracts level-wise ground-truth labels from radiological reports for multiple types and grades of spinal stenosis; second, disc-level vertebral segmentation and localization using a U-Net framework in combination with spine-curve fitting; third, a multi-input, multi-task, and multi-class CNN that performs central canal and foraminal stenosis grading. The Department of Radiology at Massachusetts General Hospital (MGH) provided a lumbar MRI dataset of 22,796 disc levels extracted from 4075 patients. The proposed algorithm achieves an accuracy of 94% for both spinal canal and foraminal stenosis.
In [37], automatic landmark localization was performed with the WHDV method, which includes a U-Net CNN architecture. The CNN is trained in a modified form to estimate a Gaussian response, known as “heatmap only”, around each target, and its predictions are used to vote for the point position. For evaluation, a dataset of 1696 radiographs of the hips of children aged 2–11, including both normal and diseased cases, was used. The experimental results show a significant improvement in comparison with the RFRV-CLM method, with median errors of 6.92% and 5.85%, respectively.
Kim et al. [38] proposed a novel semi-automatic segmentation algorithm for the vertebrae in lumbar MRI. After the ROI of each vertebra is extracted, the parameters are specified with the help of a correlation map. The ROIs are tuned with the Hough transform and Canny edge filtering. Segmentation is then carried out with graph-based and line-based algorithms. When tested on lumbar sagittal MRI, the algorithm reached a Dice similarity coefficient of 90% in comparison with manual segmentation.
Rehman et al. [39] discussed a CAD system for accurate vertebra segmentation based on region-based deep learning. A modified form of U-Net is combined with a level set and shape prediction; the result is termed the FU-Net framework. The network was trained for 500 epochs with early convergence, using a momentum of 0.9 and a dropout of 0.2 between two adjacent layers. All experiments use 2D sagittal slices, and data augmentation is required to improve learning performance. The proposed methodology achieved a Dice score of 96% and an absolute surface distance of 0.1 ± 0.05 on two different datasets, CSI 2016 and CSI 2014.
In [40], vertebral segmentation is discussed along with the detection of vertebral abnormalities. The research emphasizes the deep learning paradigm for medical images. Chuang et al. suggested an iterative segmentation model based on 3D U-Net and DeconvNet that segments all categories of vertebrae: cervical, thoracic, and lumbar. The authors used the xVertSeg dataset, with cross-entropy as the loss function for multi-label classification. They claimed better performance than the earlier methodology of Lessmann et al. and improved memory efficiency, as their model used 17% less memory.
Lessmann et al. [41] addressed vertebra segmentation and the identification of abnormalities. They proposed an iterative approach that uses a fully convolutional neural network to segment and label vertebrae. The approach has four major components: (1) segmentation of voxels from a 3D patch, (2) an instance memory, (3) an identification sub-network, and (4) a completeness classification sub-network. The computational spine imaging (CSI) 2014 dataset and thoracolumbar spine CT were used. The average Dice score for segmentation was 94.9 ± 2.1%, with 93% correct anatomical identification. Figure 5 presents some of the segmentation results.
Aubert et al. [42] proposed an automated 3D spine reconstruction technique. A training dataset of 400 images of idiopathic scoliosis with a mean Cobb angle of 43° was taken from Sainte-Justine University Hospital. Statistical spine shape modeling depicts the global view of the spine curve along with the local shape of each vertebra, which is represented geometrically by a simplified parametric model (SPM). Automatic spinal landmark detection is carried out using CNN patch-based regression. A coarse-to-fine strategy was formulated for automatic 3D reconstruction. The mean (SD) landmark location errors, measured as 3D Euclidean distances, were 1.6 (1.3) mm for the vertebral body center, 1.8 (1.3) mm for the endplate centers, and 2.3 (1.4) mm for the pedicle centers.

This latest era covers the last 4 years of research, which show substantial progress in medical imaging. The research in these years is characterized by the use of different CNN architectures, which are commonly used for segmentation. CSI datasets are in demand for spinal research. Notably, CNNs dominate but require a large number of images for good training and testing. Medical data are hard to collect and also require medical expertise for annotation and labeling. Despite the great progress in results, no imaging technique is free of noise. Therefore, issues that affect segmentation, such as illumination, artifacts, and low contrast, require attention to improve results.
Pasha et al. [43] conducted a study on X-rays of 103 adolescent idiopathic scoliosis patients. A 3D spine model was used to measure clinical parameters such as thoracic Cobb, lumbar Cobb, sacral slope, pelvic tilt, thoracic kyphosis, and lumbar lordosis, and the T1–L5 vertebral centroids were connected to form a 3D curve. Isotropic scaling was carried out to normalize spine heights. A total of 17 Z levels were calculated from the T1–L5 coordinates, which produced a normal spine cohort. Agglomerative hierarchical clustering was used to merge similar spines into one cluster. The difference between clusters identifies the maximum dissimilarity from the normal spine. Three anatomical views of the spine were determined in each cluster. The results indicated 5 different right thoracic 3D curves; the maximum dissimilarity was found in patients with a hypo-thoracolumbar kyphotic profile (44%) and a flat sagittal profile (56%). Both sagittal and frontal imbalances were found.
In 2019, Chen et al. [44] proposed combining a 3D fully convolutional network (FCN) with a hidden Markov model (HMM) for the identification and localization of vertebrae. The authors used 242 CT scans for training and 60 scans for evaluation from the public dataset of the MICCAI challenge. First, an FCN was trained to detect vertebra centroids. Second, an FCN was formulated to use both local and global information from the scans; this classification network handles the indexing of the vertebrae. The authors proposed an HMM-based post-processing strategy to increase robustness and achieve high-level optimization. On the test data, the method produced a mean identification rate of 87.97% and a mean error distance of 2.56 mm.

In [45], Pastor et al. conducted a study on 232 CT scans with arbitrary fields of view over a period of 12 months. The dataset was split into two groups: 186 scans for training and 46 scans for testing. In the first stage, centroids were annotated manually. In the second stage, a learning-based decision forest method was implemented, with detection based on a random regression forest (RRF) for localization and identification of the vertebrae. Voxel-wise operations were applied to improve the results: image binarization and dilation followed by a logical NOT, achieving an identification rate of 79.6%.
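The two figures reported for vertebra identification above, an identification rate and a mean error distance, can be computed from predicted and reference centroids as in the following Python sketch. It assumes that a vertebra counts as identified when its predicted centroid lies within 20 mm of the reference centroid with the same label; this threshold and the simplified criterion are assumptions, not details given in the reviewed text, and all names and coordinates are hypothetical.

```python
import math


def evaluate_centroids(predicted, reference, max_dist_mm=20.0):
    """Return (identification rate, mean error distance in mm).

    predicted, reference: dicts mapping a vertebra label to (x, y, z) in mm.
    A label missing from `predicted` counts as not identified.
    """
    distances = []
    identified = 0
    for label, ref_point in reference.items():
        if label not in predicted:
            continue
        d = math.dist(predicted[label], ref_point)
        distances.append(d)
        if d <= max_dist_mm:  # assumed identification criterion
            identified += 1
    id_rate = identified / len(reference)
    mean_error = sum(distances) / len(distances) if distances else float("nan")
    return id_rate, mean_error


# Hypothetical centroids for three lumbar vertebrae.
reference = {"L1": (0, 0, 0), "L2": (0, 0, 30), "L3": (0, 0, 60)}
predicted = {"L1": (0, 0, 3), "L2": (0, 4, 30), "L3": (0, 0, 85)}
id_rate, mean_error = evaluate_centroids(predicted, reference)
```

In this toy case two of the three centroids fall within the threshold, and the mean error is the average of 3, 4, and 25 mm.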
Vergari et al. [46] proposed a classification technique for the automatic detection of scoliosis. The dataset originally consisted of 796 radiographs and was augmented to 2096 images, which were divided into 1892 training images and 204 validation images. The classification method was inspired by the LeNet-5 architecture: three convolutional layers followed by batch normalization and max pooling layers, with a dropout layer at the end. The results are further processed through discriminant analysis, which improved the correct classification rate to 96.5%.
In 2020, Alharbi et al. [47] evaluated their approach to automatic scoliosis angle calculation on a dataset of 243 images provided by King Saud University, Riyadh, Saudi Arabia. The approach reached 90% detection accuracy. The authors used the CLAHE method in preprocessing to enhance X-ray quality and the ResNet CNN framework to detect the vertebrae, supported by transfer learning. For the Cobb angle measurement, the corners of each bounding box are used to calculate its center point. The angles of the lines between the vertebra centers are then measured relative to the y-axis by adding 90° and, if the resulting angle is greater than 90°, subtracting it from 180°. The difference between the minimum and maximum line angles gives the Cobb angle. This novel way of calculating the Cobb angle differs from the clinical measurement by a small margin of 5–10°.
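The center-point idea described above can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes the bounding boxes are ordered from top to bottom in image coordinates, measures the signed tilt of each line joining consecutive centers from the vertical axis with `math.atan2` instead of the 90°/180° adjustment described in the text, and takes the difference between the largest and smallest tilt as the Cobb angle. All function names and coordinates are hypothetical.

```python
import math


def box_center(x_min, y_min, x_max, y_max):
    """Center point of a bounding box given by its corner coordinates."""
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def segment_tilts(centers):
    """Signed tilt (degrees from the vertical axis) of each line joining
    consecutive centers; centers must be ordered top to bottom."""
    tilts = []
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        tilts.append(math.degrees(math.atan2(x1 - x0, y1 - y0)))
    return tilts


def cobb_angle(centers):
    """Difference between the maximum and minimum segment tilt."""
    if len(centers) < 2:
        raise ValueError("at least two vertebra centers are required")
    tilts = segment_tilts(centers)
    return max(tilts) - min(tilts)


# Toy spine: the upper segment leans 10 degrees one way, the lower one
# 10 degrees the other way, so the sketch should report about 20 degrees.
dx = 10.0 * math.tan(math.radians(10.0))
curved = [(0.0, 0.0), (dx, 10.0), (0.0, 20.0)]
straight = [(5.0, 0.0), (5.0, 10.0), (5.0, 20.0)]
angle_curved = cobb_angle(curved)
angle_straight = cobb_angle(straight)
```

Using a signed tilt keeps left-leaning and right-leaning segments distinguishable, which is what makes the minimum-to-maximum difference meaningful for a curve that changes direction.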