UAV-Landing Inclination Dataset: Enabling Inclination-Aware Surface Detection from UAV | Research Square

Research Article

Ishan Narayan, Dapinder Kaur, Neeraj Battish, Shashi Poddar

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8472654/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted 9

Abstract

Unmanned Aerial Vehicles (UAVs) have emerged as a transformative tool for 3D reconstruction, offering diverse applications in urban planning, infrastructure monitoring, and emergency response. This work introduces a dataset that combines synthetic and real-world visual images for estimating the inclination of surfaces from a UAV, termed the UAV-Landing Inclination Dataset (UAV-LID). This work also proposes an ensemble deep learning architecture that detects possible landing surfaces and estimates their inclination angles.
The surface detection architecture uses a YOLOv7 module, while the inclination angle estimator uses different backbone architectures. The dataset consists of visual images of different kinds of possible surfaces at different heights and inclination angles. Backbones based on VGG16, EfficientNet, and ConvNeXt have been evaluated for the task of inclination estimation, of which the EfficientNet-based architecture shows promising performance. Experimental results show that deep learning-based networks can be used effectively for this purpose and can, in future, be extended to landing UAVs directly on slanted surfaces.

Keywords: UAV landing, inclination, deep learning, detection, ConvNeXt, EfficientNet

1. INTRODUCTION

Accurate surface inclination estimation is critical in various applications, such as drone-based solar panel inspection, autonomous UAV landing, and drone-based delivery. Landing site selection is an important aspect of autonomous UAV operation, which uses 3D point clouds to represent terrain information and assess the feasibility of landing on a site. Among candidate landing sites, rooftops present a unique challenge because their diverse geometries and varying inclination angles affect the interpretation of aerial imagery. Accurately determining the inclination angles of surfaces, particularly rooftops, is crucial for improving the precision of autonomous vision-based landing. Among the architectures available for terrain estimation, commonly used approaches combine LiDAR and vision-based systems. LiDAR processing is computationally intensive, which makes it unsuitable for small form factor UAV systems, and it has several other limitations. Vision-based landing site selection has therefore gained popularity over the last few years, and ArUco marker-based landing is the most common approach.
However, this cannot be generalized to unknown surfaces at different inclinations. It is therefore necessary to devise mechanisms by which vision-based landing site selection can be carried out without any pre-existing markers. This work introduces an ensemble-based learning approach that estimates the inclination angle of any detected surface from the downward-looking camera mounted on a UAV. Although several architectures exist for autonomous driving scenarios, top-down views from UAVs have been explored much less, and the field is still evolving. The lack of comprehensive datasets tailored to this problem has further limited progress. Most existing datasets focus on general object detection or segmentation, while the estimation of surface parameters such as inclination and roughness remains an ill-posed problem. Therefore, a novel dataset specifically designed to address the challenge of surface inclination estimation has been generated and evaluated here. Both real and synthetic data have been prepared, comprising high-resolution UAV images of surfaces and rooftops with annotated bounding boxes and corresponding inclination angles. This dataset aims to fill a critical gap in the aerial data analytics needed for autonomous UAV operations by providing a benchmark for evaluating machine learning and computer vision models. The synthetic dataset is created using the ROS (Robot Operating System) Gazebo simulation platform, while the real-world data collection uses an industrial quadcopter for capturing images. This dataset is a valuable resource for researchers focusing on inclination estimation and lays the groundwork for advancing downstream tasks such as surface identification, structural monitoring, and precision planning. The diversity of scenes and rooftop types in both the simulated and real-world environments is intended to help models trained on the dataset generalize to real-world applications.
Accordingly, the main contributions of this paper are (a) an ensemble-based machine learning approach to estimate surface inclination angle; and (b) a real-world and synthetic dataset of different surface types with different inclination angles. We introduce the UAV-Landing Inclination Dataset (UAV-LID), a benchmark dataset that provides precise ground-truth inclination angle annotations of potential landing surfaces for UAVs. Unlike existing datasets that classify surfaces coarsely (e.g., flat vs. non-flat), UAV-LID offers fine-grained, degree-level inclination labels across diverse terrains, enabling robust development of learning-based safe landing and slope estimation algorithms. The rest of the paper is organized as follows: Section 2 discusses related work in inclination estimation and existing datasets; Section 3 details the proposed methodology for surface detection and inclination estimation; Section 4 provides a detailed overview of the dataset, including its structure and generation process; Section 5 analyses the proposed ensemble approach for inclination estimation using different backbone architectures; and the final section concludes the paper and outlines future directions.

2. BACKGROUND AND RELATED WORK

Autonomous landing site selection for unmanned aerial vehicles (UAVs) is one of the most critical and challenging aspects of autonomous flight operations, requiring sophisticated decision-making capabilities across diverse and unpredictable terrain environments [1]. Landing sites can be identified either on natural, unknown surfaces or on previously known, marked sites. While marker-based approaches offer higher precision and reliability in controlled environments, they are inherently limited by their dependency on pre-installed infrastructure. For unstructured terrains, the selection mechanism often integrates multiple visual and sensor-derived factors into a weighted decision framework [2].
Recent advances in autonomous landing systems integrate sophisticated computer vision techniques, semantic segmentation networks, and pose estimation algorithms to enhance landing site identification [3] [4]. This work focuses on image-based terrain parameter estimation, specifically inclination, which is considered the most important factor for landing site identification. Relying solely on RGB sensors and annotated labels reduces hardware complexity and processing overhead.

2.1 Traditional Terrain parameter estimation

Most vision-based architectures estimate multiple features from the images captured by the camera, such as texture, color, shape, and geometric cues, that can determine the suitability of candidate landing sites. Early methods focused on edge detection, gradient operators, and non-maximum suppression to estimate local slopes and surface normals from images [5]. In [6], the authors devised a vision-based technique that estimates surface slope for UAV perching using a monocular camera, although it may struggle with noisy real-world imagery or textured environments. Other traditional approaches employed pose estimation and keypoint extraction, inferring tilt or orientation by analyzing geometric transformations between image features and the camera pose [7]. Some terrain parameter estimation pipelines also utilize elevation models, where slope is defined by differences in grid cell elevation across a rasterized representation of the terrain [8]. While such methods produce good results, their accuracy and robustness depend critically on the availability and accuracy of external depth maps or LiDAR measurements. This reliance limits their applicability in lightweight, vision-only UAV deployments or where real-time operation is needed [9]. Lim et al.
[10] proposed a UAV framework that fuses semantic segmentation and depth data from an onboard RGB camera and LiDAR sensor to identify and land on safe spots in complex environments. In [11], the pipeline for UAV landing site selection compares candidate landing sites by evaluating terrain characteristics, such as slope and roughness, to determine the safest and most feasible landing locations. Chatzikalymnios and Moustakas [12] proposed a vision-based framework to analyze rooftop environments, where Gaussian processes are used to model feasible landing zones so that UAVs not only identify candidate regions but also quantify confidence in their suitability. Such approaches highlight the importance of integrating machine learning and structural priors into landing site evaluation, offering a scalable solution for real-world autonomous operations. There are still several disadvantages in extracting surface information from 3D point cloud data. In remote sensing applications, estimating surface roughness from satellite imagery and multispectral imaging is an essential task, but it requires surface elevation models to be known.

2.2 Deep learning-based terrain parameter estimation

Integrating deep learning architectures into UAV landing site detection has yielded higher accuracy and robustness than traditional vision-based approaches. Park et al. [13] developed a stereo vision-based landing site search algorithm employing convolutional neural networks (CNNs) to identify flat, obstacle-free regions suitable for landing. Similarly, Mittal et al. [14] employ encoder-decoder networks with skip connections to extract multi-scale features from stereo pairs, enabling accurate depth estimation and terrain classification. Neves et al.
[15] proposed a multimodal transformer-based architecture that processes RGB, thermal, and depth information to provide global context understanding, which is crucial for distinguishing safe landing zones. Marcu et al. [16] introduced SafeUAV, an encoder-decoder CNN framework that leverages synthetic training data to learn depth estimation and safe landing area prediction for UAVs. Deploying a trained CNN-based model for landing-site recognition significantly reduces onboard hardware overhead and overall payload constraints, while eliminating the need for additional communication protocols to maintain maneuverability using terrain cues alone. Model pruning and compression further enable efficient deployment on the resource-constrained edge devices commonly used in UAV platforms. However, practical deployment requires a systematic investigation of achievable inference frame rates on embedded hardware, as well as of the varying spatial extent of the scene to be processed at different flight altitudes. Furthermore, real-world operations often involve non-nadir camera configurations due to vehicle dynamics and environmental disturbances. To address this, the proposed dataset incorporates diverse camera tilt and viewpoint variations, so that models can be trained and evaluated under realistic flight conditions.

3. METHODOLOGY

This work aims to provide an architectural pipeline that can be used to predict the inclination angle of a surface from an aerial view. An ensemble learning approach is devised here which consists of two parts: Deep learning-based Surface Detection (DSD) and a Deep regression network for Inclination Estimation (DIE). Ensemble learning has been widely used to enhance model performance by integrating multiple architectures in one framework. The DSD architecture detects the possible surfaces in an image frame and provides the corresponding bounding box coordinates and class labels.
Object detection methods, particularly YOLO (You Only Look Once) variants, demonstrate significant capabilities for different use cases. One of these variants, YOLOv7 [17], is used here for surface detection, with the bounding box coordinates of different surfaces serving as training data. The detected surface regions are then cropped from the frames and passed on to the next stage, where their inclination is estimated. Deep neural architectures with different backbones, namely artificial neural networks (ANN), VGG [18], ConvNeXt [19], and EfficientNet [20], have been incorporated in the DIE framework along with linear regression to yield the inclination angle from these cropped image portions. Artificial Neural Networks (ANNs) typically consist of fully connected architectures that map inputs to high-level representations. Although widely applied across diverse problem domains, such shallow networks are limited in their ability to generate complex features compared to modern deep learning architectures. Among deep learning backbones, VGG16 is a widely used framework that employs small convolutional kernels stacked hierarchically to capture fine-grained spatial details, making it a robust model for image classification and feature extraction. EfficientNet, on the other hand, introduces compound scaling to jointly optimize depth, width, and resolution, and leverages inverted bottleneck layers (MBConv) [21] with squeeze-and-excitation (SE) modules to efficiently capture complex representations while maintaining high accuracy. More recently, ConvNeXt incorporates depthwise convolutions, enhanced normalization, and transformer-inspired design refinements, thereby balancing the efficiency of convolutional networks with the representational capacity of vision transformers.
In the context of UAV landing perception, such a design is particularly advantageous as it generalizes well across diverse terrain features and scales effectively to high-resolution aerial imagery. These properties make ConvNeXt a strong backbone for reliable inclination estimation, a key requirement for safe autonomous landing. Collectively, these architectures have been selected for their proven utility across a range of computer vision tasks, with each offering unique features that enable comparative insights. In the proposed Deep Inclination Estimator (DIE), these backbone architectures are employed to extract high-level features, followed by a regression head that predicts the inclination angle. The DSD architecture processes visual images captured by the camera and generates a set of bounding boxes that enclose the surfaces. Each bounding box comes with a class label (indicating a surface) and a confidence score. The bounding box coordinates (x, y, width, height) and the associated labels are then passed to the next stage of the pipeline. The corresponding image regions are resized and normalized to match the input requirements of the DIE architecture. The DIE model processes these images through multiple convolutional layers and other architectural components, extracting high-level features that capture surface characteristics. A regression layer at the final stage predicts the surface angle in degrees. During training, the network minimizes the difference between the predicted and the ground truth angles through backpropagation, optimizing its ability to estimate the inclination angle accurately. During inference, the trained DSD model takes an input image, applies YOLOv7 to detect surfaces, and then uses the corresponding bounding box coordinates to extract the relevant surface regions. These regions are passed on to the DIE model, which yields the predicted angle for each detected surface.
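The hand-off between the two stages (bounding box, padded crop, resize, normalize) can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Box`, `padded_crop_window`, `crop`, `resize_nearest`, and `normalize` are hypothetical, the box is assumed to be given in pixels with (x, y) as its top-left corner, the padding value is an assumption, and images are represented as plain nested lists of 8-bit grayscale values to keep the example dependency-free.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels; (x, y) is ASSUMED to be the top-left corner."""
    x: int
    y: int
    w: int
    h: int


def padded_crop_window(box: Box, img_w: int, img_h: int, pad: int = 8):
    """Return (left, top, right, bottom) of the padded box, clipped to the image.

    The padding amount is an illustrative assumption, not a value from the paper.
    """
    left = max(0, box.x - pad)
    top = max(0, box.y - pad)
    right = min(img_w, box.x + box.w + pad)
    bottom = min(img_h, box.y + box.h + pad)
    return left, top, right, bottom


def crop(image, window):
    """Cut the window out of an image stored as a list of pixel rows."""
    left, top, right, bottom = window
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(patch, size=64):
    """Nearest-neighbour resize to size x size (64 is the DIE input size in Table 4)."""
    src_h, src_w = len(patch), len(patch[0])
    return [
        [patch[r * src_h // size][c * src_w // size] for c in range(size)]
        for r in range(size)
    ]


def normalize(patch):
    """Scale 8-bit pixel values to the range [0, 1]."""
    return [[value / 255.0 for value in row] for row in patch]
```

A detected region would then be prepared as `normalize(resize_nearest(crop(image, padded_crop_window(box, img_w, img_h))))` before being passed to the regression network.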
The proposed ensemble approach is novel in terms of its arrangement and does not require complex architectures to perform surface detection and angle estimation in a single framework. The architecture can be improved in future by swapping in newer YOLO versions or better regression architectures as the need arises. The overall architecture of the proposed ensemble learning framework consists of sub-modules for detecting potential surfaces in the image and cropping the detected surfaces with additional padding on all sides, followed by their input into the DIE model, as shown in Fig. 1.

4. DATASET GENERATION

Datasets are an essential component of image-based data analytics, and the dearth of real-world aerial data forces users to rely on synthetic data alone. Synthetic datasets for UAV and aerial scenarios primarily target traditional computer vision tasks, including object detection, semantic segmentation, depth estimation, and classification. Examples include the UAV-City dataset, which targets semantic segmentation; the UAVScenes dataset [26], which provides semantic annotations for both images and LiDAR point clouds; and the Air2Land dataset [27], which addresses UAV landing scenarios in different lighting conditions. Despite the abundance of synthetic datasets, a critical gap persists: the absence of datasets specifically designed for surface inclination estimation, which is a fundamental parameter for UAV landing site selection. To address this gap, an inclination-focused real and synthetic image dataset has been generated here and is termed the UAV-Landing Inclination Dataset (UAV-LID). It incorporates precise ground truth labels and inclination angles for visible surface patches, enabling quantitative evaluation of inclination estimation accuracy. This section provides details of the experimental procedures undertaken for generating the synthetic data and preparing the real-world dataset.
Synthetic dataset generation

This pipeline consists of creating a simulation environment in ROS-Gazebo containing different surfaces with predefined inclination angles, which are captured from different altitudes using a visual camera mounted on a UAV. ArduPilot was used as the flight controller software because of the availability of its software-in-the-loop plugin, which allows the framework to run directly on a PC without any external hardware. Different information parameters, such as height and velocity, are sent through the inertial measurement unit (IMU) to the ROS node, enabling the UAV camera to record ground-looking visual images. The UAV camera has been positioned to look down at the ground at different angles, and a live feed is obtained from the UAV camera and recorded using ROS, as shown in Fig. 2. A brief overview of the different environment types and the parameters considered for dataset generation is provided in Table 1. Here, inclination angles refer to the inclination of the different surfaces in the image. The surfaces in the rural scenario consist mostly of huts, whereas the surfaces in the semi-urban and urban scenarios consist of buildings and houses at different inclinations. The huts are textured as straw, the semi-urban house tops as brick or concrete, and the urban buildings as concrete or wood, so as to create a realistic landing scenario for drone-based deliveries. A mixture of textures using different materials, patterns, and background colors helps generate diversity in the images, as texture plays an important role in shape representation.
Table 1 Parameters considered for dataset generation

| Simulation environment | Surface type | Inclination angles | Height of UAV |
| Urban | Buildings, apartments, House, gas station | 0°, 15°, 20°, 25°, 30°, 35°, 45°, 55°, 57° | 25, 40, 50 m |
| Rural | Hut | 15°, 30°, 45°, 60° | 25, 40, 50 m |
| Semi-Urban | Hut, House | 55°, 57° | 25, 40, 50 m |

The roof-top contour represents the landing surface for simplicity, and the inclination angle of the surface is the ground truth value known a priori. The overall pipeline of dataset capture and image labelling is shown in Fig. 3. The label annotations include the inclination angle for each surface; they have been produced manually for each image and are available online along with the dataset. The ground truth data is organized in CSV format with the following fields: image_name, x, y, w, h, and the corresponding inclination angle. The synthetic dataset includes images captured by a UAV at three different view angles in three different surroundings, as mentioned in Table 1.

Table 2 Number of images generated in the dataset for different environment settings

| Environment | View angle 90° | View angle 75° | View angle 60° |
| No. of images (Total) | 1027 (58%) | 470 (26.5%) | 270 (15.2%) |
| Urban (51.3%) | 459 | 330 | 117 |
| Semi Urban (12.05%) | 105 | 55 | 53 |
| Rural (36.68%) | 463 | 85 | 100 |

The dataset contains 1767 images in total, of which 51% are from urban settings, 12% from semi-urban settings, and 37% from rural settings. The images captured in the different environments contain from one to eight different surfaces with different inclinations. Table 2 gives the number of images captured in each environment at each camera view angle, and Table 3 gives the total number of objects (surfaces) present in those images.
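The CSV ground-truth layout described above (image_name, x, y, w, h, inclination angle) could be read as in the following sketch. The header name `inclination_angle`, the class `SurfaceLabel`, the helper names, and the sample rows are all hypothetical; the released files may use different column names or units.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SurfaceLabel:
    image_name: str
    x: float
    y: float
    w: float
    h: float
    angle_deg: float


def read_labels(handle):
    """Parse one SurfaceLabel per CSV row (column names are assumptions)."""
    reader = csv.DictReader(handle)
    return [
        SurfaceLabel(
            image_name=row["image_name"],
            x=float(row["x"]),
            y=float(row["y"]),
            w=float(row["w"]),
            h=float(row["h"]),
            angle_deg=float(row["inclination_angle"]),
        )
        for row in reader
    ]


def group_by_image(labels):
    """Collect the labels of each image, since one image can hold several surfaces."""
    grouped = defaultdict(list)
    for label in labels:
        grouped[label.image_name].append(label)
    return dict(grouped)


# Made-up rows for illustration only; they are not taken from UAV-LID.
SAMPLE = (
    "image_name,x,y,w,h,inclination_angle\n"
    "urban_001.png,120,80,60,40,15\n"
    "urban_001.png,300,200,50,50,30\n"
    "rural_004.png,10,10,100,90,45\n"
)

labels = read_labels(io.StringIO(SAMPLE))
per_image = group_by_image(labels)
```

Grouping by image name reflects the statement that a single image contains between one and eight annotated surfaces.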
Table 3 Total number of objects presented in the images captured in different environmental settings

| Environment | View angle 90° | View angle 75° | View angle 60° |
| Urban | 1360 | 621 | 239 |
| Semi-urban | 298 | 238 | 490 |
| Rural | 942 | 252 | 469 |

Real-world dataset generation

The real-world data collection experiments were conducted in full compliance with the local laws on flying drones for experiments and dataset collection. Aerial videos and images were captured under these guidelines to ensure data quality, adherence to safety standards, and the generation of clear, usable datasets that can be further extended for similar applications. For this purpose, an industrial drone was used to capture top-down images of the surfaces from different altitudes. Several wooden slabs with printed sheets on top were used to represent different surface textures, as shown in Fig. 4. These slabs were positioned at multiple inclination angles with respect to the ground plane, ensuring that a wide range of slope variations was incorporated. The angle of inclination was set using an inclinometer, and the UAV was flown above the slabs to capture images at different heights. This procedure ensured that both scale variations and perspective consistency were adequately captured. The experimental setup enabled the creation of a comprehensive dataset covering a wide range of surface types and inclination angles. The non-blurred images were labeled manually for ground truthing purposes and sent to the inclination estimation framework. The real-world dataset contains frames with a single object and frames with multiple objects in both the training and the test data. The UAV camera angle is fixed here at 90° downwards, whereas the surface angles vary across the collected data. About two images are captured for each surface at each altitude with different surface angles. Figure 5 presents the details of the dataset images with respect to surface angle and altitude.
In addition, a 3-minute video containing multiple surfaces with different inclination angles in one frame was recorded to analyze the performance of both the detection and the inclination estimation pipeline. This video contained around 2000 usable frames, which are also included in the training and testing data.

5. EXPERIMENTATION

This research work proposes an ensemble deep learning pipeline to detect ground surfaces and analyze their safety with respect to UAV landing. The ensemble architecture is evaluated on inclination angle estimation for both the real-world and the synthetic dataset. All experiments have been performed on a Windows-based platform using Python with different computer vision and deep learning libraries. This section provides details of the experiments, including the training parameters, results, and comparative analysis.

5.1 Training Parameters

The ensemble pipeline is trained on a workstation with an NVIDIA RTX A4000 (16GB) GPU and 36GB RAM. For experimentation, 70% of the total data is used for training, 10% for validation, and 20% for testing. The other training parameters for the ensemble pipeline are given in Table 4.

Table 4 Training Parameters for the ensemble pipeline

| Architecture | Parameter | Value |
| Surface Detection Architecture | Input | 640 |
| Surface Detection Architecture | Batch Size | 16 |
| Surface Detection Architecture | Epochs | 300 |
| Surface Detection Architecture | Learning Rate | 0.01 |
| Surface Detection Architecture | Optimizer | ADAM |
| Inclination Estimation Architecture | Input | 64 |
| Inclination Estimation Architecture | Batch Size | 32 |
| Inclination Estimation Architecture | Epochs | 300 |
| Inclination Estimation Architecture | Optimizer | ADAM |

5.2 Experimentation Results

The detection and regression architectures are evaluated separately in order to assess the efficacy of the proposed ensemble pipeline. This section provides qualitative and quantitative analysis for both architectures.
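The 70/10/20 partition stated in Section 5.1 can be sketched as follows. This is only one plausible way to produce such a split: the function name `split_dataset`, the random shuffling, the fixed seed, and the rounding behaviour (any remainder goes to the test set) are assumptions, and the text does not say whether the split was random or stratified by environment.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.1, seed=0):
    """Shuffle items reproducibly and cut them into train, validation, and test lists.

    The test set receives whatever remains after the train and validation cuts.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Example with as many placeholder IDs as there are synthetic images (1767).
train_ids, val_ids, test_ids = split_dataset(range(1767))
```

Fixing the seed keeps the partition identical across runs, which matters when several backbones are compared on the same test set.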
Performance Analysis of Detection Pipeline

The surface detection task is performed using the YOLOv7 architecture, and its performance is evaluated through standard detection metrics, including precision, recall, mean average precision (mAP) at various thresholds, and detection rate. Precision quantifies the proportion of correctly predicted instances among all predicted positives, representing the model’s ability to avoid false detections. Recall, on the other hand, measures the proportion of actual positive instances that are correctly identified, reflecting the model’s sensitivity. The mAP@0.5 metric is the mean average precision computed at an Intersection over Union (IoU) threshold of 0.5, where a detection is considered correct if the overlap between the predicted and ground-truth bounding boxes is at least 50% (IoU ≥ 0.5). Additionally, the mAP@0.5:0.95 metric averages precision across multiple IoU thresholds ranging from 0.5 to 0.95 in increments of 0.05, providing a more comprehensive measure of detection robustness. The analysis for both synthetic and real-world datasets based on these metrics is summarized in Table 5.

Table 5 Overall performance analysis of surface detection on synthetic and real-world data based on different performance metrics

| | Synthetic Precision | Synthetic Recall | Synthetic mAP@0.5 | Synthetic mAP@0.5:0.95 | Real-World Precision | Real-World Recall | Real-World mAP@0.5 | Real-World mAP@0.5:0.95 |
| At 300 epoch | 0.8656 | 0.9221 | 0.9238 | 0.6468 | 0.97 | 0.9939 | 0.9902 | 0.7621 |
| Average | 0.8237 | 0.8258 | 0.8257 | 0.5168 | 0.8899 | 0.8909 | 0.8946 | 0.6255 |

The precision results at the 300th epoch, along with their average values, demonstrate strong precision performance, indicating that most surface regions are accurately detected. However, the observation of high recall and elevated mAP values at the final epoch, coupled with comparatively lower average values, suggests slight overfitting of the model.
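The IoU criterion and the precision and recall definitions used above can be made concrete with a short sketch. It is a simplification under stated assumptions: boxes are given as (x1, y1, x2, y2) corners, predictions are assumed to be already sorted by confidence, and the greedy one-to-one matching in the hypothetical `precision_recall` helper is not the exact evaluation protocol of the YOLOv7 tooling.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """Greedily match each prediction to the best unmatched ground-truth box.

    A prediction counts as a true positive when its best IoU is at least iou_thr.
    """
    matched = set()
    true_pos = 0
    for pred in preds:
        best_idx, best_iou = None, iou_thr
        for idx, truth in enumerate(truths):
            if idx in matched:
                continue
            overlap = iou(pred, truth)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            matched.add(best_idx)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall
```

Repeating the precision computation for thresholds 0.5, 0.55, and so on up to 0.95 and averaging the resulting average-precision values is what the mAP@0.5:0.95 column summarizes.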
The overfitting can be mitigated through model fine-tuning or by enhancing dataset diversity to improve generalization. The performance of the YOLOv7 model for surface detection is illustrated in Fig. 6. Three distinct environmental settings (urban, semi-urban, and rural) along with three camera orientations (90°, 75°, and 60°) were considered for data simulation using the ROS framework. The surface detection outcomes on these simulated scenes are illustrated in Fig. 6 (a–c). As shown in Fig. 6 (a), surfaces featuring multiple inclination angles are successfully identified as separate surfaces. In contrast, Fig. 6 (b) presents a false detection, where two surfaces with different inclinations are incorrectly detected as a single surface. Surfaces with well-defined structural boundaries are detected with high confidence, as evident in Fig. 6 (c). Figures 6 (d–f) depict real-world UAV-captured surfaces, where the distinct surface separation results in relatively higher detection accuracy, thereby minimizing false detections.

Table 6 Detection accuracy (in %) for surface detection using YOLOv7 in different environments and views for synthetic dataset

| Environment | View angle 90° | View angle 75° | View angle 60° |
| Urban | 97.8 | 96.7 | 89.05 |
| Semi-urban | 95.6 | 83.9 | 90.3 |
| Rural | 96.6 | 95.1 | 94.1 |

Table 6 provides the surface detection accuracy for different environments and view angles. As seen, the accuracy at the 90° view angle is the highest of the three UAV view angles in every environment. This can be attributed to the improper mapping of surfaces in the camera at the other view angles. The detected surface regions are used as input to the regression network, which estimates their inclination angle. The proposed deep regression network architecture with different backbones is trained using the original ground truth inclination values to balance the models’ capabilities. The training parameters given in Table 4 are kept the same for all regression networks, and computations are performed accordingly.
On real-world data, an overall detection accuracy of 94.5% is achieved with a camera view angle of 90°.

Performance Analysis of Inclination Estimation Pipeline

The proposed architecture is evaluated on synthetic and real-world data in separate experiments. The performance in terms of inclination angle prediction is analyzed for different deep learning backbones, namely ANN, VGG16, ConvNeXt, and EfficientNet. Their accuracy is computed at three different error thresholds, i.e., T1 \((\pm 5^\circ)\), T2 \((\pm 2^\circ)\), and T3 \((\pm 1^\circ)\). A prediction is counted as correct at a given threshold if the predicted inclination angle lies within the corresponding range of the true value. Table 7 compares the overall performance of surface inclination angle prediction for both synthetic and real-world data.

Table 7 Accuracy comparison of different backbones in the proposed deep regression architecture for inclination estimation

| DIE backbones | Synthetic T1 | Synthetic T2 | Synthetic T3 | Real-World T1 | Real-World T2 | Real-World T3 |
| ANN | 72.37 | 54.81 | 35.97 | 68.36 | 52.54 | 43.50 |
| VGG16 | 25.09 | 15.78 | 9.23 | 74.58 | 59.32 | 42.94 |
| ConvNeXt | 84.9 | 73.77 | 60.81 | 76.84 | 66.10 | 46.89 |
| EfficientNet | 85.97 | 77.3 | 63.49 | 86.44 | 80.23 | 77.40 |

It can be seen from Table 7 that the performance of VGG16-based regression differs between synthetic and real-world data. The accuracy of VGG16 in predicting the inclination angle on synthetic data is very low. However, its performance on real-world data is relatively similar to that of ANN and ConvNeXt. The EfficientNet-based DIE architecture shows good accuracy compared to the other architectures on both datasets and is therefore the preferred choice here. However, this backbone can be replaced with newer backbones as they become available in the literature; the results are only indicative of the performance achievable through this ensemble architecture.
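The threshold-based accuracy reported in Table 7 can be written down in a few lines. The sketch below assumes that a prediction counts as correct when its absolute error is less than or equal to the tolerance (the text does not say whether the boundary is inclusive); the function name `accuracy_within` and the example angles are hypothetical.

```python
# Error tolerances in degrees for the three thresholds used in the text.
THRESHOLDS = {"T1": 5.0, "T2": 2.0, "T3": 1.0}


def accuracy_within(pred_deg, true_deg, tol_deg):
    """Percentage of predictions whose absolute error is at most tol_deg."""
    if len(pred_deg) != len(true_deg):
        raise ValueError("prediction and ground-truth lists differ in length")
    if not pred_deg:
        return 0.0
    hits = sum(abs(p - t) <= tol_deg for p, t in zip(pred_deg, true_deg))
    return 100.0 * hits / len(pred_deg)


# Made-up predictions and labels, for illustration only.
predicted = [14.2, 31.5, 40.0, 57.0]
ground_truth = [15.0, 30.0, 45.0, 57.0]
report = {
    name: accuracy_within(predicted, ground_truth, tol)
    for name, tol in THRESHOLDS.items()
}
```

Because the tolerances are nested, the accuracy can only stay equal or drop when moving from T1 to T2 to T3, which matches the pattern in every row of Table 7.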
The performance is further analyzed in two scenarios: scenario-1 provides a comparative analysis across view angles using synthetic data, and scenario-2 gives a performance analysis across altitudes for real-world data. Tables 8 and 9 display the results for these scenarios, respectively.

Table 8 Accuracy (in %) comparison of different backbones in the proposed deep regression architecture for inclination estimation (Scenario-1)

DIE backbone | 90° T1 | 90° T2 | 90° T3 | 75° T1 | 75° T2 | 75° T3 | 60° T1 | 60° T2 | 60° T3
ANN | 71.55 | 51.05 | 39.33 | 70.85 | 52.02 | 41.70 | 70.42 | 51.25 | 30.83
VGG16 | 71.97 | 51.32 | 38.21 | 76.68 | 62.78 | 49.33 | 22.50 | 10.00 | 2.50
ConvNeXt | 82.57 | 70.71 | 57.60 | 78.48 | 65.47 | 52.02 | 85.42 | 70.42 | 48.75
EfficientNet | 86.05 | 77.27 | 60.81 | 86.55 | 74.44 | 61.43 | 87.92 | 84.58 | 77.08

Table 8 compares the performance of different backbones of the proposed deep regression network at different thresholds. Overall, the EfficientNet backbone outperforms the other backbones at all view angles. EfficientNet provides an accuracy of 86% or higher with a \(\pm 5^\circ\) error margin (T1); the accuracy decreases as the threshold is tightened, and this holds for all three view angles. All architectures record their lowest accuracy at the \(\pm 1^\circ\) threshold (T3). The comparison also indicates the robustness of the EfficientNet backbone in yielding accurate estimates. While most backbones perform consistently across view angles, VGG16 performs similarly to the ANN at the 90° and 75° views but degrades sharply at the 60° camera view angle, where its T1 accuracy falls to 22.50%. One possible reason is the absence of residual connections in VGG16, which help in training deeper networks by mitigating the vanishing gradient problem.
The performance of the ANN architecture is not as good as that of the deeper architectures, which can be attributed to its inability to extract complex features.

Table 9 Accuracy (in %) comparison of different backbones in the proposed deep regression architecture for inclination estimation (Scenario-2)

Altitude (m) | ANN T1 | ANN T2 | ANN T3 | ConvNeXt T1 | ConvNeXt T2 | ConvNeXt T3 | VGG16 T1 | VGG16 T2 | VGG16 T3 | EfficientNet T1 | EfficientNet T2 | EfficientNet T3
3 | 25 | 0 | 0 | 25 | 0 | 0 | 25 | 0 | 0 | 75 | 75 | 75
6 | 92.9 | 78.6 | 71.4 | 85.7 | 71.4 | 64.3 | 85.7 | 71.4 | 64.3 | 92.9 | 92.9 | 85.7
9 | 61.4 | 47.7 | 38.6 | 70.5 | 61.4 | 31.8 | 59.1 | 38.6 | 22.7 | 77.3 | 63.6 | 59.1
12 | 70.6 | 55.9 | 38.2 | 82.4 | 64.7 | 38.2 | 82.4 | 73.5 | 55.9 | 88.2 | 85.3 | 82.4
15 | 81.8 | 72.7 | 72.7 | 90.9 | 90.9 | 63.6 | 81.8 | 72.7 | 36.4 | 90.9 | 90.9 | 90.9
18 | 69.2 | 53.8 | 53.8 | 92.3 | 92.3 | 69.2 | 92.3 | 76.9 | 53.8 | 100 | 100 | 100
21 | 66.7 | 50 | 44.4 | 83.3 | 72.2 | 66.7 | 88.9 | 77.8 | 72.2 | 100 | 88.9 | 88.9
24 | 60.9 | 43.5 | 34.8 | 78.3 | 65.2 | 52.2 | 78.3 | 60.9 | 47.8 | 82.6 | 82.6 | 78.3
27 | 75 | 50 | 37.5 | 56.3 | 50.0 | 43.8 | 62.5 | 43.8 | 18.8 | 81.3 | 68.8 | 68.8

Table 9 presents a comparative analysis of the backbone architectures in terms of prediction accuracy across different altitudes. The results reveal that prediction accuracy does not exhibit a consistent upward or downward trend with altitude. For instance, at altitudes of 15 m and 18 m, most models achieve higher accuracy across all thresholds. Conversely, performance at the 3 m altitude is significantly lower, primarily because images captured at such low altitudes lack sufficient spatial context and surface detail required for accurate inclination estimation. Among the backbones integrated within the DIE architecture, the ANN model generally demonstrates inferior performance, yielding the lowest prediction accuracy in many scenarios. ConvNeXt and VGG16 perform comparably to EfficientNet at the T1 and T2 thresholds at several altitudes; however, EfficientNet matches or exceeds the other backbones at every altitude and threshold. Hence, the EfficientNet-based backbone emerges as the most suitable choice for reliable surface inclination estimation.
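A per-altitude breakdown such as Table 9 amounts to grouping samples by altitude before applying the threshold test. The sketch below is illustrative only: the function name and the sample tuples are hypothetical, and the number of test images per altitude is not stated in the text. Returning the sample count alongside each percentage is a design choice made here, since a group with very few images can only take a handful of distinct accuracy values.

```python
from collections import defaultdict


def grouped_threshold_accuracy(samples, tolerance_deg):
    """Compute threshold accuracy separately for each group.

    samples: iterable of (group_key, predicted_deg, actual_deg) tuples, where
             group_key could be an altitude in metres or a camera view angle.
    Returns a dict mapping group_key -> (accuracy in %, number of samples).
    Assumption: the tolerance bound is inclusive.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, predicted, actual in samples:
        totals[group] += 1
        if abs(predicted - actual) <= tolerance_deg:
            hits[group] += 1
    return {
        group: (100.0 * hits[group] / totals[group], totals[group])
        for group in sorted(totals)
    }


if __name__ == "__main__":
    # Made-up (altitude_m, predicted_deg, actual_deg) samples for illustration.
    samples = [
        (3, 12.0, 20.0),
        (3, 19.0, 20.0),
        (6, 30.5, 30.0),
        (6, 33.0, 30.0),
        (6, 29.0, 30.0),
        (6, 31.0, 30.0),
    ]
    for altitude, (acc, n) in grouped_threshold_accuracy(samples, 2.0).items():
        print(f"{altitude} m: {acc:.1f}% of {n} samples within 2 degrees")
```

The same function can be reused for the view-angle comparison by passing the camera angle as the group key.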
Failure in detection / angle estimation

Several instances demonstrate that surface detection and inclination estimation exhibit reduced reliability under certain conditions, as illustrated in Fig. 7. The degradation in performance is particularly evident at higher altitudes or in scenes containing multiple surfaces with varying textures and inclinations. In the ROS Gazebo simulations, where three distinct camera orientations were evaluated, detection reliability declined significantly for slanted camera views. Under these conditions, the network often misclassified multiple inclined surfaces as a single surface or, in some cases, failed to detect any valid region. In the real UAV dataset, similar difficulties arise when processing high-altitude imagery or frames containing dense surface patterns. At greater altitudes, the captured images lack sufficient spatial resolution and edge definition, leading to blurred geometric cues essential for accurate surface discrimination. Moreover, overlapping textures, shadows, and illumination variations further reduce the model’s ability to extract meaningful features for segmentation.

The regression-based angle estimation also suffers in these cases because the extracted feature maps lack clear geometric gradients or distinct surface boundaries, which are critical for predicting inclination. When the input features are either noisy or spatially ambiguous, as often occurs in high-altitude and slanted-view scenarios, the regression layer cannot establish a consistent mapping between appearance patterns and corresponding inclination angles. Consequently, the predicted angles exhibit higher variance and reduced correlation with ground truth data. These observations highlight that the proposed framework’s performance is highly dependent on camera viewpoint, altitude, and scene complexity.
Addressing these limitations may involve strategies such as multi-view feature fusion, resolution-adaptive training, or incorporating depth priors to strengthen both detection and regression accuracy under varying imaging conditions.

Overall, the results suggest that the proposed ensemble framework is a promising approach for identifying potential landing sites by estimating the inclination angles of surfaces. By enabling the detection of multiple surface inclinations within a single frame, the framework provides a practical means for UAVs to identify and land safely on inclined terrains. The experiments also indicate that the architecture, particularly with the EfficientNet backbone, estimates inclination angles reliably under varying camera view angles. Nevertheless, further improvements are necessary; specifically, incorporating the camera view angle as an explicit input parameter during training could enhance overall model performance. Additionally, the framework may be extended to estimate other critical surface attributes, such as roughness and steepness, to support more comprehensive and reliable landing site selection.

6. CONCLUSION

This work introduces the UAV-Landing Inclination Dataset (UAV-LID), comprising a synthetic ROS-Gazebo dataset with precise ground-truth labels and a real-world aerial dataset capturing diverse surfaces and altitudes. This work also proposes an ensemble learning framework for UAV-based surface inclination estimation, which can be used for autonomous landing site selection. By unifying surface detection with inclination angle regression, the framework achieves an accuracy of 86.44% within ±5° on real-world data with the EfficientNet backbone. The UAV-LID dataset addresses a critical data gap and enables robust validation and domain transfer analysis. The framework’s lightweight design and reliance on monocular vision make it practical for resource-constrained UAV platforms where conventional sensing is prohibitive.
Its applicability extends beyond landing site selection to terrain analysis, surveying, and environmental monitoring. Future work will expand UAV-LID and integrate additional landing parameters, environmental dynamics, and multi-modal fusion to further enhance terrain characterization and safe UAV autonomy.

Declarations

The research leading to these results received funding from DRDO – Aeronautical Research & Development Board for carrying out this activity.

Author Contribution

IN and DK conceptualized the idea, wrote the first draft, and carried out the simulation; IN, DK, NB, and SP carried out the field experiments; DK and NB carried out the experimentation on GPUs; IN and SP proposed the methodology and designed the experiments; IN and SP carried out the final review; and SP provided the necessary resources and funding for this work.

ACKNOWLEDGEMENT

The authors would like to acknowledge the funding support received from DRDO – Aeronautical Research & Development Board for carrying out this activity. The authors would also like to thank the Director CSIO for providing the necessary support. The authors would like to acknowledge DGCA and ATC for providing the necessary permissions for carrying out the drone flight activities.

Data Availability

The dataset discussed in the manuscript will be made publicly available in a GitHub repository once the article is published.

References

[1] Gautam, A., Sujit, P.B., Saripalli, S.: A survey of autonomous landing techniques for UAVs. In: International Conference on Unmanned Aircraft Systems (ICUAS), pp. 1210–1218. IEEE (2014). https://ieeexplore.ieee.org/abstract/document/6842377/
[2] Yu, L., et al.: Deep learning for vision-based micro aerial vehicle autonomous landing. Int. J. Micro Air Veh. 10(2), 171–185 (2018). https://doi.org/10.1177/1756829318757470
[3] Xin, L., Tang, Z., Gai, W., Liu, H.: Vision-based autonomous landing for the UAV: A review. Aerospace 9(11), 634 (2022)
[4] Baidya, R., Jeong, H.: Simulation and real-life implementation of UAV autonomous landing system based on object recognition and tracking for safe landing in uncertain environments. Front. Rob. AI 11, 1450266 (2024)
[5] Saripalli, S., Montgomery, J.F., Sukhatme, G.S.: Vision-based autonomous landing of an unmanned aerial vehicle. In: Proceedings IEEE International Conference on Robotics and Automation (Cat. No. 02CH37292), pp. 2799–2804. IEEE (2002). https://ieeexplore.ieee.org/abstract/document/1013656/
[6] Zhang, H., Zhao, J.: Vision based surface slope estimation for unmanned aerial vehicle perching. In: Dynamic Systems and Control Conference, p. V002T21A004. American Society of Mechanical Engineers (2018). https://asmedigitalcollection.asme.org/DSCC/proceedings-abstract/DSCC2018/51906/270966
[7] ZHANG, Z., Quanrui, C., Qiufu, W., Xiaoliang, S.U.N., Qifeng, Y.U.: Monocular visual estimation for autonomous aircraft landing guidance in unknown structured scenes. Chin. J. Aeronaut., 103479 (2025)
[8] Kakaletsis, E., Nikolaidis, N.: Potential UAV Landing Sites Detection through Digital Elevation Models Analysis. arXiv:2107.06921 (2021). https://doi.org/10.48550/arXiv.2107.06921
[9] Garcia-Pulido, J.A., Pajares, G., Dormido, S., de la Cruz, J.M.: Recognition of a landing platform for unmanned aerial vehicles by using computer vision-based techniques. Expert Syst. Appl. 76, 152–165 (2017)
[10] Lim, J., Kim, M., Yoo, H., Lee, J.: Autonomous multirotor UAV search and landing on safe spots based on combined semantic and depth information from an onboard camera and LiDAR. IEEE/ASME Trans. Mechatron. 29(5), 3960–3970 (2024)
[11] Lin, S., Jin, L., Chen, Z.: Real-time monocular vision system for UAV autonomous landing in outdoor low-illumination environments. Sensors 21(18), 6226 (2021)
[12] Chatzikalymnios, E., Moustakas, K.: Landing Site Detection for Autonomous Rotor Wing UAVs Using Visual and Structural Information. J. Intell. Robot. Syst. 104(2), 27 (2022). https://doi.org/10.1007/s10846-021-01544-6
[13] Park, J., Kim, Y., Kim, S.: Landing site searching and selection algorithm development using vision system and its application to quadrotor. IEEE Trans. Control Syst. Technol. 23(2), 488–503 (2014)
[14] Mittal, M., Mohan, R., Burgard, W., Valada, A.: Vision-Based Autonomous UAV Navigation and Landing for Urban Search and Rescue. In: Asfour, T., Yoshida, E., Park, J., Christensen, H., Khatib, O. (eds.) Robotics Research. Springer Proceedings in Advanced Robotics, vol. 20, pp. 575–592. Springer International Publishing, Cham (2022). https://doi.org/10.1007/978-3-030-95459-8_35
[15] Neves, F.S., Branco, L.M., Pereira, M.I., Claro, R.M., Pinto, A.M.: A multimodal learning-based approach for autonomous landing of UAV. In: 20th IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications (MESA), pp. 1–8. IEEE (2024). https://ieeexplore.ieee.org/abstract/document/10704866/
[16] Marcu, A., Costea, D., Licaret, V., Pîrvu, M., Slusanschi, E., Leordeanu, M.: SafeUAV: Learning to estimate depth and safe landing areas for UAVs from synthetic data. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018). https://openaccess.thecvf.com/content_eccv_2018_workshops/w7/html/Marcu_SafeUAV_Learning_to_estimate_depth_and_safe_landing_areas_for_ECCVW_2018_paper.html
[17] Wang, C.-Y., Bochkovskiy, A., Liao, H.-Y.M.: YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7464–7475 (2023). http://openaccess.thecvf.com/content/CVPR2023/html/Wang_YOLOv7_Trainable_Bag-of-Freebies_Sets_New_State-of-the-Art_for_Real-Time_Object_Detectors_CVPR_2023_paper.html
[18] Koonce, B.: VGG Network. In: Convolutional Neural Networks with Swift for Tensorflow, pp. 35–50. Apress, Berkeley, CA (2021). https://doi.org/10.1007/978-1-4842-6168-2_4
[19] Woo, S., et al.: ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16133–16142 (2023). http://openaccess.thecvf.com/content/CVPR2023/html/Woo_ConvNeXt_V2_Co-Designing_and_Scaling_ConvNets_With_Masked_Autoencoders_CVPR_2023_paper.html
[20] Koonce, B.: EfficientNet. In: Convolutional Neural Networks with Swift for Tensorflow, pp. 109–123. Apress, Berkeley, CA (2021). https://doi.org/10.1007/978-1-4842-6168-2_10
[21] Shang, J., Zhang, K., Zhang, Z., Li, C., Liu, H.: A high-performance convolution block oriented accelerator for MBConv-Based CNNs. Integration 88, 298–312 (2023)

Additional Declarations

No competing interests reported.

Status: Under Review. Version 1. Editorial history at the journal: first submitted and submission checks completed 29 Dec, 2025; editor assigned 25 Mar, 2026; reviewers invited 07 Apr, 2026; reviewers agreed 07 Apr, 11 Apr, and 12 Apr, 2026; reviews received 09 May and 13 May, 2026.
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8472654","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":621465608,"identity":"d6927764-ed68-4c6a-ade6-3a47363480c3","order_by":0,"name":"Ishan narayan","email":"","orcid":"","institution":"CSIR - Central Scientific Instruments Organisation","correspondingAuthor":false,"prefix":"","firstName":"Ishan","middleName":"","lastName":"narayan","suffix":""},{"id":621465609,"identity":"8dfaf416-020e-4030-9ede-a3990512249a","order_by":1,"name":"Dapinder Kaur","email":"","orcid":"","institution":"Academy of Scientific and Innovative Research","correspondingAuthor":false,"prefix":"","firstName":"Dapinder","middleName":"","lastName":"Kaur","suffix":""},{"id":621465611,"identity":"48b0e46c-a8ef-4a67-80ab-f587e23649da","order_by":2,"name":"Neeraj Battish","email":"","orcid":"","institution":"CSIR - Central Scientific Instruments Organisation","correspondingAuthor":false,"prefix":"","firstName":"Neeraj","middleName":"","lastName":"Battish","suffix":""},{"id":621465612,"identity":"a24af631-9bac-4210-b82e-65cd97b956a5","order_by":3,"name":"Shashi 
Poddar","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABH0lEQVRIie2RP0vEMBiHf0fBW4JdW070K7xQKAjCfZUcgl0KFgSnoxRcBdfKfQnhVodA4FwOXCPeIoVODpUsDg4mVuhgCjc65FmSvMnD+yeAx/MfiYAAZDaC/5wZphXAf2/FfgoT+ygYFBPhrncDJ6ubRhfF7vjw6bztsCyPwrhpurdHXIahmMjir0K7TTqrqU3ibZvU2EgWry7SiLc4va85ZO1QIp4GjOTiQfEEk0owes0PwAWIFCCZo7A607pXMm2UktHLNuisMn8WTgUqp1mv5DaLyaiYmYDNYubmUkjl10axvbxfgdtebm0vIqJILaqRwtaafdmJZWt0y3IeTmXz8SnOKLyTUrsK6xH9woeI+S9T56gw8ssej8fj6fkG615pVwRIx5kAAAAASUVORK5CYII=","orcid":"","institution":"CSIR - Central Scientific Instruments Organisation","correspondingAuthor":true,"prefix":"","firstName":"Shashi","middleName":"","lastName":"Poddar","suffix":""}],"badges":[],"createdAt":"2025-12-29 11:38:39","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8472654/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8472654/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":106871367,"identity":"e139e889-0bdb-42cf-b781-67d3ae687b06","added_by":"auto","created_at":"2026-04-14 09:46:40","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":270308,"visible":true,"origin":"","legend":"\u003cp\u003eArchitecture of the proposed inclination angle estimation pipeline\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/25a82c31fb6849320f1342b7.png"},{"id":106961535,"identity":"6ebe0b1c-062a-4f6c-ae70-f34bc5aa15f7","added_by":"auto","created_at":"2026-04-15 09:25:56","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":104297,"visible":true,"origin":"","legend":"\u003cp\u003eSynthetic dataset generation pipeline using ROS 
framework\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/92b6bcf2989c1b5338dea3d9.png"},{"id":106994183,"identity":"a2412d49-5f0a-48c7-bf64-c63c921e1547","added_by":"auto","created_at":"2026-04-15 15:06:06","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":1129140,"visible":true,"origin":"","legend":"\u003cp\u003eDataset generation and labelling framework\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/ef319b98f81f38f0777551e9.png"},{"id":106960638,"identity":"d9029183-77cd-4cf8-a3af-20c47bf035db","added_by":"auto","created_at":"2026-04-15 09:22:23","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":3510640,"visible":true,"origin":"","legend":"\u003cp\u003eDifferent type of inclined surfaces for real dataset generation\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/bb3a43e4cc637d0a3c59125c.png"},{"id":106871370,"identity":"127d7d67-287a-4be4-8e57-8fd770b1df62","added_by":"auto","created_at":"2026-04-14 09:46:40","extension":"jpeg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":150014,"visible":true,"origin":"","legend":"\u003cp\u003eRepresenting the image distribution: a) Number of images with respect to surface angles, b) Number of images at different heights\u003c/p\u003e","description":"","filename":"floatimage5.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/8bde5114cf99b57c927e8fea.jpeg"},{"id":106871371,"identity":"ebb88fa2-b683-4a3b-b191-cb89e100a3c8","added_by":"auto","created_at":"2026-04-14 09:46:40","extension":"png","order_by":6,"title":"Figure 
6","display":"","copyAsset":false,"role":"figure","size":1224573,"visible":true,"origin":"","legend":"\u003cp\u003eSurface prediction result with trained YOLOv7 model: (a-c) Synthetic Images; (d-f) Real world images\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/753f5add87df0c7f858b9f9a.png"},{"id":106871373,"identity":"62b6a8d8-38c3-4e62-a915-8f59d720c0d2","added_by":"auto","created_at":"2026-04-14 09:46:40","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":2487496,"visible":true,"origin":"","legend":"\u003cp\u003eSurface detection performance variation with different camera tilt angles: (a) - (d) Multiple surfaces are incorrectly merged and detected as a single region with some surfaces remain undetected, (e) - (h) Presence of multiple surfaces within a single frame leads to missed detections or incomplete surface extraction\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/7676dc97f5291b27e0bf8489.png"},{"id":106994918,"identity":"b067332d-34cd-4a0b-83a1-6771b6e93817","added_by":"auto","created_at":"2026-04-15 15:20:32","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":12188111,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8472654/v1/17f81e4b-d579-4c23-a1ca-3dab52416c89.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"UAV-Landing Inclination Dataset: Enabling Inclination-Aware Surface Detection from UAV","fulltext":[{"header":"1. INTRODUCTION","content":"\u003cp\u003eAccurate surface inclination estimation is critical in various applications, such as drone based solar panel inspection, autonomous UAV landing, and drone-based delivery. 
Landing site selection is an important aspect of autonomous UAV which uses 3D point cloud to represent terrain information and assess the feasibility of landing on a site. Among several landing sites, rooftops present a unique challenge due to their diverse geometries and varying inclination angles, affecting the interpretation of aerial imagery. Accurately determining the inclination angles of surfaces, particularly rooftops, is crucial for improving the precision of autonomous vision-based landing. Among several architectures available for estimating terrains, commonly used approaches include combination of LIDAR and vision-based systems. LIDAR being computationally intensive, are not suitable for small form factor UAV systems and have several other limitations. Vision based landing site selection has therefore gained popularity over the last few years and aruco-markers based landing is most common. However, this cannot be generalized for unknown surfaces at different inclination. It is therefore necessary to devise mechanisms by which the vision-based landing site selection can be carried out without the need of any existing markers.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThis work introduces an ensemble-based learning approach that aims at estimating the inclination angle of any detected surface from the downward looking camera mounted on a UAV. Although several architectures exist for autonomous driving scenarios, much less has been explored for the top-down views from the UAV and is still an evolving field. Consequently, the lack of comprehensive datasets tailored to this problem has limited the advancements in this field. Most existing datasets focus on general object detection or segmentation, while the estimation of surface parameters like inclination and roughness still remains an ill-poised problem. Therefore, a novel dataset specifically designed to address the challenge of surface inclination estimation has been generated and experimented here. 
Both real and synthetic dataset has been prepared comprising of high-resolution synthetic and real UAV images of surfaces and rooftops with annotated bounding boxes and corresponding inclination angles. This dataset aims to fill a critical gap in aerial data analytics needed for autonomous UAV operations by providing a benchmark for evaluating machine learning and computer vision models. The synthetic dataset is created using the ROS (Robot Operating System) Gazebo simulation platform while the real-world simulation uses an industrial quadcopter for capturing images. This dataset is a valuable resource for researchers focusing on inclination estimation and lays the groundwork for advancing downstream tasks like surface identification, structural monitoring, and precision planning. The diversity of scenes and rooftop types in both the simulated and real-world environment ensures that the dataset can generalize well to real-world applications. Accordingly, the main contribution of this paper are (a) Ensemble based machine learning approach to estimate surface inclination angle; and, (b) real-world and synthetic dataset of different surface types with different inclination angles.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe introduce the UAV-Landing Inclination Dataset (UAV-LID), a benchmark dataset that provides precise ground-truth inclination angle annotations of potential landing surfaces for UAVs. Unlike existing datasets that classify surfaces coarsely (e.g., flat vs. non-flat), UAV-LID offers fine-grained, degree-level inclination labels across diverse terrains, enabling robust development of learning-based safe landing and slope estimation algorithms. 
The rest of the paper is organized as follows: section 2 discusses related work in inclination estimation and existing datasets; Section 3 provides a detailed overview of the dataset, including its structure and generation process; Section 4 details the proposed methodology for surface detection and inclination estimation; section 5 analyses the proposed ensemble approach for inclination estimation using different backbone architectures, and finally, section 5 concludes the paper and outlines future directions.\u003c/p\u003e"},{"header":"2. BACKGROUND AND RELATED WORK","content":"\u003cp\u003eAutonomous landing site selection for unmanned aerial vehicles (UAVs) represents one of the most critical and challenging aspects of autonomous flight operations, requiring sophisticated decision-making capabilities across diverse and unpredictable terrain environments [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The landing site identification can be on a natural unknown surface and on pre-known marked sites. While marker-based approaches offer higher precision and reliability in controlled environments, they are inherently limited by their dependency on pre-installed infrastructure. For unstructured terrains, the selection mechanism often integrates multiple visual and sensor-derived factors into a weighted decision framework [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eRecent advances in autonomous landing systems integrate sophisticated computer vision techniques, semantic segmentation networks, and pose estimation algorithms to enhance landing site identification [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e] [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. 
This work emphasizes on the image-based terrain parameter estimation, specifically, inclination which is considered as the most important factor for landing site identification. Relying solely on RGB sensors and annotated labels, reduces hardware complexity and processing overhead.\u003c/p\u003e \u003cdiv id=\"Sec2\" class=\"Section2\"\u003e \u003ch2\u003e2.1 Traditional Terrain parameter estimation\u003c/h2\u003e \u003cp\u003eMost of the vision-based architectures aim at multiple feature estimation from the images captured by the camera, such as texture, color, shape, and geometric cues that can determine the suitability of candidate landing sites. Early methods focused on edge detection, gradient operators and non-maximum suppression processes to estimate local slopes and surface normals from images [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. In [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e], the authors devised a technique for vision-based surface slope estimation for UAV perching using a monocular camera to estimate slope and may struggle with noisy real-world imagery or textured environments. Other traditional approaches employed pose estimation and keypoint extraction, inferring tilt or orientation by analyzing geometric transformations between image features and the camera pose [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Some terrain parameter estimation pipelines also utilize elevation models, where slope is defined by differences in grid cell elevation across a rasterized representation of the terrain [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. While such methods produce good results, their accuracy and robustness depend critically on the availability and accuracy of external depth maps or LiDAR measurements. 
This reliance limits their applicability in lightweight, vision-only UAV deployments or where real-time operation is needed [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] .\u003c/p\u003e \u003cp\u003eLim et al. [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e] proposed a UAV framework that fuses semantic segmentation and depth data from an onboard RGB camera and LiDAR sensor to identify and land on safe spots in complex environments. In [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e], the pipeline for UAV landing site selection compares different candidate landing site options by evaluating the terrain characteristics, such as slope and roughness, to determine the safest and most feasible landing locations. Chatzikalymnios and Moustakas [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] proposed a vision-based framework to analyze rooftop environments, where Gaussian processes are used to model feasible landing zones and ensures that UAVs not only identify candidate regions but also quantify confidence in their suitability. Such approaches highlight the importance of integrating machine learning and structural priors into landing site evaluation, offering a scalable solution for real-world autonomous operations. There are still several disadvantages in extracting surface information from 3D point cloud data. In remote sensing applications, estimating surface roughness using satellite imagery and multispectral imaging is an essential aspect, requiring surface elevation models to be known.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.2 Deep learning-based terrain parameter estimation\u003c/h2\u003e \u003cp\u003eIntegrating deep learning architectures into UAV landing site detection has been benchmarked with unprecedented accuracy and robustness compared to traditional vision-based approaches. 
Park et al. [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e] developed a stereo vision-based landing site search algorithm employing convolutional neural networks (CNNs) to identify flat, obstacle-free regions suitable for landing. Similarly, Mittal et al. [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] employ encoder-decoder networks with skip connections to extract multi-scale features from stereo pairs, enabling accurate depth estimation and terrain classification. Neves et al. [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e] proposed a multimodal transformer-based architecture that processes RGB, thermal, and depth information to provide global context understanding, crucial for distinguishing safe landing zones. Marcu et al. [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e] introduced SafeUAV, an encoder\u0026ndash;decoder CNN framework that leverages synthetic training data to learn depth estimation and safe landing area prediction for UAVs.\u003c/p\u003e \u003cp\u003eDeploying a trained CNN-based model for landing-site recognition significantly reduces onboard hardware overhead and overall payload, while eliminating the need for additional communication protocols, since maneuverability is maintained using terrain cues alone. Model pruning and compression further enable efficient deployment on resource-constrained edge devices commonly used in UAV platforms. However, practical deployment necessitates a systematic investigation of achievable inference frame rates on embedded hardware, as well as the varying spatial extent of the scene to be processed at different flight altitudes. Furthermore, real-world operations often involve non-nadir camera configurations due to vehicle dynamics and environmental disturbances. 
To address this, the proposed dataset incorporates diverse camera tilt and viewpoint variations to improve the robustness of the model under realistic flight conditions.\u003c/p\u003e \u003c/div\u003e"},{"header":"3. METHODOLOGY","content":"\u003cp\u003eThis work aims to provide an architectural pipeline that can be used to predict the inclination angle of a surface from an aerial view. An ensemble learning approach is devised here which consists of two parts, that is, Deep learning-based Surface Detection (DSD) and a Deep regression network for Inclination Estimation (DIE). Ensemble learning has been widely utilized to enhance model performance by integrating multiple architectures in one framework. The DSD architecture detects the possible surfaces in an image frame and provides the corresponding bounding box coordinates and class labels. Object detection methods, particularly YOLO (You Only Look Once) variants, demonstrate significant capabilities for different use cases. One of these variants, YOLOv7 [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e], is used here for surface detection and is trained on the bounding box coordinates of the annotated surfaces. The detected surface regions are then cropped from these frames and passed on to the next stage, where their inclination is estimated. 
Different backbones, namely artificial neural networks (ANN), VGG [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e], ConvNeXt [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e], and EfficientNet [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], have been incorporated in the DIE framework along with linear regression to yield the inclination angle from these cropped image portions.\u003c/p\u003e \u003cp\u003eArtificial Neural Networks (ANNs) typically consist of fully connected architectures that map inputs to high-level representations. Although widely applied across diverse problem domains, such shallow networks are limited in their ability to generate complex features compared to modern deep learning architectures. Among deep learning backbones, VGG16 is a widely used framework that employs small convolutional kernels stacked hierarchically to capture fine-grained spatial details, making it a robust model for image classification and feature extraction. EfficientNet, on the other hand, introduces compound scaling to jointly optimize depth, width, and resolution, and leverages inverted bottleneck layers (MBConv) [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] with squeeze-and-excitation (SE) modules to efficiently capture complex representations while maintaining high accuracy. More recently, ConvNeXt incorporates depthwise convolutions, enhanced normalization, and transformer-inspired design refinements, thereby balancing the efficiency of convolutional networks with the representational capacity of vision transformers. In the context of UAV landing perception, such a design is particularly advantageous as it generalizes well across diverse terrain features and scales effectively to high-resolution aerial imagery. 
These properties make ConvNeXt a strong backbone for reliable inclination estimation, a key requirement for safe autonomous landing. Collectively, these architectures have been selected for their proven utility across a range of computer vision tasks, with each offering unique features that enable comparative insights. In the proposed Deep Inclination Estimator (DIE), these backbone architectures are employed to extract high-level features, followed by a regression head that predicts the inclination angle.\u003c/p\u003e \u003cp\u003eThe DSD architecture processes visual images captured by the camera and generates a set of bounding boxes that enclose the detected surfaces. Each bounding box is accompanied by a class label (indicating a surface) and a confidence score. The bounding box coordinates (\u003cem\u003ex, y, width, height\u003c/em\u003e) and the associated labels are then passed to the next stage of the pipeline. The corresponding image regions are resized and normalized to match the input requirements of the DIE architecture. The DIE model processes these images through multiple convolutional layers and other architectural components, extracting high-level features that capture surface characteristics. A regression layer at the final stage predicts the surface angle in degrees. During training, the network minimizes the difference between the predicted and the ground truth angles through backpropagation, optimizing its ability to estimate the inclination angle accurately. During inference, the trained DSD model takes an input image, applies YOLOv7 to detect surfaces, and then uses the corresponding bounding box coordinates to extract the relevant surface regions. These regions are passed on to the DIE model, which yields the predicted angle for each detected surface. The proposed ensemble approach is novel in terms of its arrangement and does not require complex architectures to detect surfaces and estimate their inclination angles within a single framework. 
This architecture can be further improved in future by swapping in newer YOLO versions or better regression architectures, as the need arises. The overall architecture of the proposed ensemble learning framework consists of sub-modules for detection of potential surfaces from the image, cropping the detected surfaces with additional padding on all sides, followed by their input into the DIE model as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"4. DATASET GENERATION","content":"\u003cp\u003eDatasets form an essential component of image-based data analytics, and the dearth of real-world aerial data forces users to rely on synthetic data alone. Synthetic datasets for UAV and aerial scenarios primarily target traditional computer vision tasks including object detection, semantic segmentation, depth estimation, and classification. Among the available synthetic datasets, the UAV-City dataset targets semantic segmentation, the UAVScenes dataset [26] provides semantic annotations for both images and LiDAR point clouds, and the Air2Land dataset [27] addresses UAV landing scenarios in different lighting conditions. Despite the abundance of synthetic datasets, a critical gap persists: the absence of datasets specifically designed for surface inclination estimation, which is a fundamental parameter for UAV landing site selection. To address this gap, an inclination-focused real and synthetic image dataset has been generated here and is termed the UAV-Landing Inclination Dataset (UAV-LID). It incorporates precise ground truth labels and inclination angles for visible surface patches, enabling quantitative evaluation of inclination estimation accuracy. 
This section provides details of the experimental procedures undertaken for generating the synthetic data and preparing the real-world dataset.\u003c/p\u003e \u003cp\u003e \u003cstrong\u003eSynthetic dataset generation\u003c/strong\u003e \u003cp\u003eThis pipeline consists of creating a simulation environment in ROS-Gazebo containing different surfaces with predefined inclination angles, which are captured from different altitudes using a visual camera mounted on a UAV. ArduPilot was used as the flight controller because of the availability of its software-in-the-loop plugin, which allows the framework to run directly on a PC without any external hardware. Different parameters, like height, velocity, etc., are sent through the inertial measurement unit (IMU) to the ROS node, enabling the UAV camera to record ground-looking visual images. The UAV camera has been positioned to look down at the ground at different angles, and a live feed is obtained from the UAV camera and recorded using ROS as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. A brief overview of different environment types and the parameters considered for dataset generation is provided in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Here, inclination angles refer to the inclination of different surfaces in the image. The surfaces in the rural scenario consist mostly of huts, whereas the surfaces in semi-urban and urban scenarios consist of buildings and houses at different inclinations. The huts are textured as made of straw, semi-urban house tops are textured as made of bricks/concrete, and the urban buildings are textured as made of concrete/wood so as to create a real-world landing scenario during drone-based deliveries. 
A mixture of textures using different materials, patterns and background colors helps in generating diversity in the images, as texture plays an important role in shape representation.\u003c/p\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eParameters considered for dataset generation\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSimulation environment\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSurface type\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eInclination angles\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eHeight of UAV\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUrban\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eBuildings, apartments, House, gas station\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0\u0026deg;, 15\u0026deg;, 20\u0026deg;, 25\u0026deg;, 30\u0026deg;, 35\u0026deg;, 45\u0026deg;, 55\u0026deg;, 57\u0026deg;\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e25, 40, 50 m\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRural\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHut\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e15\u0026deg;, 30\u0026deg;, 45\u0026deg;, 60\u0026deg;\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e25, 40, 50 m\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSemi-Urban\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHut, House\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e55\u0026deg;, 57\u0026deg;\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e25, 40, 50 m\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe roof-top contour represents the landing surface for simplicity, and the inclination angle of the surface is the ground truth value known a priori. The overall pipeline of dataset capturing and image labelling is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. The label annotations include the inclination angle of each surface; they have been assigned manually for each image and are available online along with the dataset. 
The ground truth data is organized in CSV format with the following fields: image_name, \u003cem\u003ex, y, w, h\u003c/em\u003e and corresponding inclination angle.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe synthetic dataset includes the images captured by a UAV in three different view angles with three different surroundings, as mentioned in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eNumber of images generated in the dataset for different environment settings\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eEnvironment\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eView Angle\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e90\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e75\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eNo. of images (Total)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1027 (58.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e470 (26.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e270 (15.3%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUrban (51.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e459\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e330\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e117\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSemi Urban (12.05%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e105\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e55\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e53\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRural (36.67%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e463\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e85\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e100\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe dataset contains 1767 images in total, of which 51% are from urban settings, 12% from semi-urban settings, and 37% from rural settings. 
The images captured in the different environments each contain between one and eight surfaces with different inclinations. Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e lists the number of images captured in each environment at each camera view angle. Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e lists the total number of objects (surfaces) present in the images captured in each environment at each camera view angle.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eTotal number of objects present in the images captured in different environmental settings\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eEnvironment\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eView Angle\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e90\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e75\u0026deg;\u003c/p\u003e \u003c/th\u003e 
\u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUrban\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1360\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e621\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e239\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSemi-urban\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e298\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e238\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e490\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRural\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e942\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e252\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e469\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cstrong\u003eReal-world dataset generation\u003c/strong\u003e \u003cp\u003eThe real-world data collection experiments were conducted in full compliance with the local laws for flying drones for experiment and dataset collection. 
Aerial videos and images were captured under these guidelines to ensure data quality, adherence to safety standards, and the generation of clear, usable datasets that can be further extended for similar applications. For this purpose, an industrial drone was used to capture top-down images of the surfaces from different altitudes. Several wooden slabs with printed sheets on top of them were used to represent different surface textures as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e. These slabs were positioned at multiple inclination angles with respect to the ground plane, ensuring that a wide range of slope variations was incorporated. The angle of inclination was set using an inclinometer, and the UAV was flown above the slabs to capture images at different heights. This procedure ensured that both scale variations and perspective consistency were adequately captured. This experimental setup enabled the creation of a comprehensive dataset covering a wide range of surface types and inclination angles. The non-blurred images were labeled manually for ground truthing purposes and sent to the inclination estimation framework.\u003c/p\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe real-world dataset contains frames with a single object as well as frames with multiple objects, in both the training and test data. The UAV camera angle is fixed here at 90\u0026deg; downwards, whereas surface angles are variable in the collected data. Approximately two images are captured for each surface at each altitude with different surface angles. 
Figure\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e presents the details of the dataset images with respect to surface angle and altitude.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn addition to this, a 3-minute video, which consists of multiple surfaces in one frame with different inclination angles, was recorded to analyze the performance of both the detection and inclination estimation pipelines. This video contained around 2000 usable frames, which are also included in the training and testing data.\u003c/p\u003e"},{"header":"5. EXPERIMENTATION","content":"\u003cp\u003eThis research work proposes an ensemble deep learning pipeline to detect ground surfaces and analyze their safety with respect to UAV landing. Experiments are performed with the ensemble architecture to estimate the inclination angle of surfaces on both the real-world and the synthetic datasets. All experiments have been performed on a Windows-based platform using Python with different computer vision and deep learning libraries. This section provides details of the experiments, including the training parameters, results, and comparative analysis.\u003c/p\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003e5.1 Training Parameters\u003c/h2\u003e \u003cp\u003eThis ensemble pipeline is trained on a workstation with an NVIDIA RTX A4000 (16 GB) GPU and 36 GB RAM. For experimentation, 70% of the total data is used for training, 10% for validation, and 20% for testing. 
The other training parameters for ensemble pipeline are given in Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eTraining Parameters for the ensemble pipeline\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eParameter\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eValues\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colspan=\"2\" nameend=\"c2\" namest=\"c1\"\u003e \u003cp\u003eSurface Detection Architecture\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eInput\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e640\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBatch Size\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e16\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEpochs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLearning Rate\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c2\"\u003e \u003cp\u003e0.01\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOptimizer\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eADAM\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c2\" namest=\"c1\"\u003e \u003cp\u003e\u003cb\u003eInclination Estimation Architecture\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eInput\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e64\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBatch Size\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e32\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEpochs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOptimizer\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eADAM\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e5.2 Experimentation Results\u003c/h2\u003e \u003cp\u003eThe experimentation is conducted for both detection and regression architecture separately in order to evaluate the efficacy of the proposed ensemble pipeline. 
This section provides qualitative and quantitative analysis for both the architectures.\u003c/p\u003e \u003cp\u003e \u003cb\u003ePerformance Analysis of Detection Pipeline\u003c/b\u003e \u003c/p\u003e \u003cp\u003eThe surface detection task is performed using the YOLOv7 architecture, and its performance is evaluated through standard detection metrics, including precision, recall, mean average precision (mAP) at various thresholds, and detection rate. Precision quantifies the proportion of correctly predicted instances among all predicted positives, representing the model\u0026rsquo;s ability to avoid false detections. Recall, on the other hand, measures the proportion of actual positive instances that are correctly identified, reflecting the model\u0026rsquo;s sensitivity. The
mAP@0.5 corresponds to the Mean Average Precision computed at an Intersection over Union (IoU) threshold of 0.5, where a detection is considered correct if the overlap between the predicted and ground-truth bounding boxes is at least 50% (IoU\u0026thinsp;\u0026ge;\u0026thinsp;0.5). Additionally, the
mAP@0.5:0.95 metric averages precision across multiple IoU thresholds ranging from 0.5 to 0.95 in increments of 0.05, providing a more comprehensive measure of detection robustness. The analysis for both synthetic and real-world datasets based on these metrics is summarized in Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eOverall performance analysis of surface detection on synthetic and real-world data based on different performance metrics\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"9\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colspan=\"4\" nameend=\"c5\" namest=\"c2\"\u003e 
\u003cp\u003eSynthetic Data\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"4\" nameend=\"c9\" namest=\"c6\"\u003e \u003cp\u003eReal-World Data\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003ePrecision\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003eRecall\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003emAP\u003c/b\u003e\u003csup\u003e\u003cb\u003e0.5\u003c/b\u003e\u003c/sup\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003emAP\u003c/b\u003e\u003csup\u003e\u003cb\u003e0.5\u0026ndash;0.95\u003c/b\u003e\u003c/sup\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003ePrecision\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003eRecall\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e\u003cb\u003emAP\u003c/b\u003e\u003csup\u003e\u003cb\u003e0.5\u003c/b\u003e\u003c/sup\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e\u003cb\u003emAP\u003c/b\u003e\u003csup\u003e\u003cb\u003e0.5\u0026ndash;0.95\u003c/b\u003e\u003c/sup\u003e\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAt 300 epoch\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.8656\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9221\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9238\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c5\"\u003e \u003cp\u003e0.6468\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.97\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.9939\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e0.9902\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e0.7621\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAverage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.8237\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.8258\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.8257\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.5168\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.8899\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.8909\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e0.8946\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e0.6255\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003ePrecision is high both at the 300th epoch and on average, indicating that most surface regions are accurately detected. However, the recall and mAP values at the final epoch are noticeably higher than their average values, which suggests slight overfitting of the model. 
This issue can be mitigated through model fine-tuning or by enhancing dataset diversity to improve generalization. The performance of the YOLOv7 model for surface detection is illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThree distinct environmental settings (urban, semi-urban, and rural), along with three camera orientations (90\u0026deg;, 75\u0026deg;, and 60\u0026deg;), were considered for data simulation using the ROS framework. The surface detection outcomes for these inclination angles are illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e(a\u0026ndash;c). As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e(a), surfaces featuring multiple inclination angles are successfully identified as separate surfaces. In contrast, Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e(b) presents a false positive case, where two surfaces with different inclinations are incorrectly detected as a single surface. Surfaces with well-defined structural boundaries are detected with high confidence, as evident in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e(c). 
Figures\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e(d\u0026ndash;f) depict real-world UAV-captured surfaces, where the distinct surface separation results in relatively higher detection accuracy and fewer false detections.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab6\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDetection accuracy (in %) for surface detection using YOLOv7 in different environments and views for the synthetic dataset\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eEnvironment\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eView Angle\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e90\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e75\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eUrban\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c2\"\u003e \u003cp\u003e97.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e96.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e89.05\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eSemi-urban\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e95.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e83.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e90.3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRural\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e96.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e95.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e94.1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e provides the surface detection accuracy for different environments and view angles. As seen, the accuracy is highest at the 90\u0026deg; view angle in all three environments. This can be attributed to the improper mapping of surfaces in the camera at the other two view angles. These detected surface regions are used as input to the regression network to estimate their inclination angle. 
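\u003c/p\u003e \u003cp\u003eAs a minimal sketch of this hand-off (an illustration, not code taken from the paper), each detected bounding box could be cropped from the frame before being passed to the regression network. The function name crop_regions, the pixel box format (x1, y1, x2, y2) and the toy image are assumptions made for this example:\u003c/p\u003e

```python
def crop_regions(image, boxes):
    # image: list of pixel rows; boxes: (x1, y1, x2, y2) in pixels,
    # with x2 and y2 exclusive. This box format is an assumption.
    height, width = len(image), len(image[0])
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Clamp each box to the image bounds before slicing.
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(width, int(x2)), min(height, int(y2))
        if x2 > x1 and y2 > y1:
            crops.append([row[x1:x2] for row in image[y1:y2]])
    return crops


# Toy 4x6 single-channel image whose pixel value encodes (row, column).
image = [[10 * r + c for c in range(6)] for r in range(4)]
# Hypothetical detections; the second box extends past the image border.
boxes = [(1, 0, 4, 2), (3, 2, 9, 5)]
crops = crop_regions(image, boxes)
print(crops)
```

\u003cp\u003e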
The proposed deep regression network architecture with different backbones is trained using the original ground truth inclination values to balance the models\u0026rsquo; capabilities. The training parameters given in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e are kept the same for all regression networks and computations are performed accordingly. For real-world data, an overall detection accuracy of 94.5% is achieved with a camera view angle of 90\u0026deg;.\u003c/p\u003e \u003cp\u003e \u003cb\u003ePerformance Analysis of Inclination Estimation Pipeline\u003c/b\u003e \u003c/p\u003e \u003cp\u003eThe proposed architecture is evaluated for both synthetic and real-world data in separate experiments. The performance in terms of inclination angle prediction is analyzed for different deep learning backbones such as ANN, VGG16, ConvNeXt, and EfficientNet. Their performance in terms of accuracy is computed at three different error thresholds, i.e., T1 \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:(\\pm\\:5^\\circ\\:)\\)\u003c/span\u003e\u003c/span\u003e, T2 \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:(\\pm\\:2^\\circ\\:)\\)\u003c/span\u003e\u003c/span\u003e, and T3 \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:(\\pm\\:1^\\circ\\:)\\)\u003c/span\u003e\u003c/span\u003e. Here, a prediction is counted as correct at a given threshold if the predicted inclination angle lies within the corresponding range (T1, T2, or T3) of the true value. 
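\u003c/p\u003e \u003cp\u003eAs a minimal sketch of how such threshold-based accuracy could be computed (an illustration, not code taken from the paper), a prediction is counted as a hit when its absolute error does not exceed the tolerance. The function name threshold_accuracy, the inclusive tolerance and the sample angles are assumptions made for this example:\u003c/p\u003e

```python
def threshold_accuracy(pred_deg, true_deg, tol_deg):
    # Percentage of predictions whose absolute error is at most tol_deg degrees.
    # Treating the tolerance as inclusive is an assumption of this sketch.
    hits = sum(1 for p, t in zip(pred_deg, true_deg) if abs(p - t) <= tol_deg)
    return 100.0 * hits / len(true_deg)


# Hypothetical ground-truth and predicted inclination angles (degrees).
true_angles = [10.0, 20.0, 30.0, 40.0]
pred_angles = [10.5, 21.5, 34.0, 46.0]

# T1, T2 and T3 as defined in the text: 5, 2 and 1 degree tolerances.
thresholds = {'T1': 5.0, 'T2': 2.0, 'T3': 1.0}
results = {name: threshold_accuracy(pred_angles, true_angles, tol)
           for name, tol in thresholds.items()}
print(results)
```

\u003cp\u003eBecause each tighter tolerance admits a subset of the predictions admitted by a looser one, accuracy computed this way cannot increase from T1 to T3. 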
Table\u0026nbsp;\u003cspan refid=\"Tab7\" class=\"InternalRef\"\u003e7\u003c/span\u003e compares the overall performance of surface inclination angle prediction for both synthetic and real-world data.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab7\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 7\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eAccuracy comparison of different backbones in the proposed deep regression architecture for inclination estimation\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eDIE backbones\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eSynthetic Data\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c7\" namest=\"c5\"\u003e \u003cp\u003eReal-World Data\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eT1\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003eT2\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003eT3\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003eT1\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003eT2\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003eT3\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eANN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e72.37\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e54.81\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e35.97\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e68.36\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e52.54\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e43.50\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eVGG16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e25.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e15.78\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e9.23\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e74.58\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e59.32\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e42.94\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eConvNeXt\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e84.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e73.77\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e60.81\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e76.84\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e66.10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e46.89\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEfficientNet\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003e85.97\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e77.3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e63.49\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e86.44\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e80.23\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e77.40\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eIt can be seen from Table\u0026nbsp;\u003cspan refid=\"Tab7\" 
class=\"InternalRef\"\u003e7\u003c/span\u003e that the performance of VGG16-based regression differs markedly between synthetic and real-world data. The accuracy of VGG16 in predicting the inclination angle is very low for synthetic data, whereas its performance is relatively similar to that of ANN and ConvNeXt for real-world data. The EfficientNet-based DIE architecture achieves the highest accuracy among the compared architectures for both datasets and is therefore the preferred choice here. However, this choice is only indicative of the performance achievable through this ensemble architecture, and the backbone can be replaced with newer backbones as they become available in the literature.\u003c/p\u003e \u003cp\u003eThe performance is further analyzed in two different scenarios, where scenario-1 provides a comparative analysis based on different view angles using synthetic data, and scenario-2 gives a performance analysis based on different altitudes for real-world data. Tables\u0026nbsp;\u003cspan refid=\"Tab8\" class=\"InternalRef\"\u003e8\u003c/span\u003e and \u003cspan refid=\"Tab9\" class=\"InternalRef\"\u003e9\u003c/span\u003e display the results for these scenarios, respectively.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab8\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 8\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eAccuracy comparison of different backbones in the proposed deep regression architecture for inclination estimation (Scenario-1)\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"10\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv 
align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eDIE backbones\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003e90\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c7\" namest=\"c5\"\u003e \u003cp\u003e75\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c10\" namest=\"c8\"\u003e \u003cp\u003e60\u0026deg;\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eT1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eT2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eT3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eT1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eT2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eT3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003eT1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e 
\u003cp\u003eT2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003eT3\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eANN\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e71.55\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e51.05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e39.33\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e70.85\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e52.02\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e41.70\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e70.42\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e51.25\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e30.83\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eVGG16\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e71.97\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e51.32\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e38.21\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e76.68\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e62.78\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c7\"\u003e \u003cp\u003e49.33\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e22.50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e10.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e2.50\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eConvNeXt\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e82.57\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e70.71\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e57.60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e78.48\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e65.47\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e52.02\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e85.42\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e70.42\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e48.75\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eEfficientNet\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003e86.05\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e77.27\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e 
\u003cp\u003e\u003cb\u003e60.81\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e86.55\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e74.44\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e61.43\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e\u003cb\u003e87.92\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e\u003cb\u003e84.58\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e\u003cb\u003e77.08\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab8\" class=\"InternalRef\"\u003e8\u003c/span\u003e compares the performance of different backbones of the proposed deep regression network designed for inclination estimation at different thresholds. Overall, the EfficientNet backbone outperforms the other backbones at all view angles. EfficientNet provides an accuracy of at least 86% within a \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\pm\\:5^\\circ\\:\\)\u003c/span\u003e\u003c/span\u003e error margin (T1); this accuracy decreases as the threshold is tightened, which holds true for all three view angles. All architectures perform relatively poorly at the 1-degree threshold (T3). This comparison also shows the error resistance of the EfficientNet backbone in yielding accurate estimations. 
Also, it can be noted that while most of the backbone architectures perform consistently across the view angles, the VGG16 architecture performs similarly to the ANN architecture at 90 and 75 degrees but degrades sharply at the 60-degree camera view angle. One possible reason is the absence of residual connections in VGG16, which help in training deeper networks by mitigating the vanishing gradient problem. The ANN architecture performs worse than the ConvNeXt and EfficientNet architectures, which can be attributed to its limited ability to extract complex features.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab9\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 9\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eAccuracy comparison of different backbones in the proposed deep regression architecture for inclination estimation (Scenario-2)\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"15\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cdiv 
align=\"left\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c11\" colnum=\"11\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c12\" colnum=\"12\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c13\" colnum=\"13\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c14\" colnum=\"14\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c15\" colnum=\"15\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eAltitude (in m)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eANN\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c7\" namest=\"c5\"\u003e \u003cp\u003eConvNeXt\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"4\" nameend=\"c11\" namest=\"c8\"\u003e \u003cp\u003eVGG16\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"4\" nameend=\"c15\" namest=\"c12\"\u003e \u003cp\u003eEfficientNet\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eT1\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003eT2\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003eT3\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003eT1\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003eT2\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003eT3\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" 
colname=\"c8\"\u003e \u003cp\u003e\u003cb\u003eT1\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e\u003cb\u003eT2\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e\u003cb\u003eT3\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e\u003cb\u003eT1\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c13\"\u003e \u003cp\u003e\u003cb\u003eT2\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c14\"\u003e \u003cp\u003e\u003cb\u003eT3\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e25\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e25\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e25\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e75\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e75\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e75\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e92.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e78.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e71.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e85.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e71.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e64.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e85.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e71.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e64.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e92.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e92.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e85.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e61.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e47.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e38.6\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c5\"\u003e \u003cp\u003e70.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e61.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e31.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e59.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e38.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e22.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e77.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e63.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e59.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e70.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e55.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e38.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e82.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e64.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e38.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e82.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e73.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e55.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e 
\u003cp\u003e88.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e85.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e82.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e81.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e72.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e72.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e90.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e90.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e63.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e81.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e72.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e36.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e90.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e90.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e90.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e18\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e69.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e 
\u003cp\u003e53.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e53.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e92.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e92.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e69.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e92.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e76.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e53.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e100\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e100\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e100\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e21\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e66.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e44.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e83.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e72.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e66.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e88.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e77.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c10\"\u003e \u003cp\u003e72.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e100\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e88.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e88.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e60.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e43.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e34.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e78.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e65.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e52.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e78.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e60.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e47.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e82.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e82.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e78.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e27\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c2\"\u003e \u003cp\u003e75\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e37.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e56.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e50.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e43.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e62.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e43.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e18.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c12\" namest=\"c11\"\u003e \u003cp\u003e81.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e68.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c14\"\u003e \u003cp\u003e68.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c15\" namest=\"c15\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab9\" class=\"InternalRef\"\u003e9\u003c/span\u003e presents a comparative analysis of various backbone architectures in terms of prediction accuracy across different altitudes. The results reveal that prediction accuracy does not exhibit a consistent upward or downward trend with changes in altitude. For instance, at altitudes of 15 m and 18 m, most models achieve higher accuracy across all thresholds. 
Conversely, performance at the 3 m altitude is significantly lower, primarily because images captured at such low altitudes lack the spatial context and surface detail required for accurate inclination estimation. Among the backbones integrated within the DIE architecture, the ANN model consistently demonstrates inferior performance, yielding the lowest prediction accuracy in nearly all scenarios. In contrast, ConvNeXt and VGG16 perform comparably to EfficientNet at the T1 and T2 thresholds; however, EfficientNet exhibits superior performance across all thresholds. Hence, the EfficientNet-based backbone emerges as the most suitable choice for reliable surface inclination estimation.\u003c/p\u003e \u003cp\u003e \u003cb\u003eFailure in detection / angle estimation\u003c/b\u003e \u003c/p\u003e \u003cp\u003eSeveral instances demonstrate that surface detection and inclination estimation become less reliable under certain conditions, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e. The degradation is particularly evident at higher altitudes and in scenes containing multiple surfaces with varying textures and inclinations. In the ROS Gazebo simulations, where three distinct camera orientations were evaluated, detection reliability declined significantly for slanted camera views. Under these conditions, the network often misclassified multiple inclined surfaces as a single surface or, in some cases, failed to detect any valid region. In the real UAV dataset, similar difficulties arise when processing high-altitude imagery or frames containing dense surface patterns. At greater altitudes, the captured images lack sufficient spatial resolution and edge definition, which blurs the geometric cues essential for accurate surface discrimination. 
Moreover, overlapping textures, shadows, and illumination variations further reduce the model\u0026rsquo;s ability to extract meaningful features for segmentation. The regression-based angle estimation also suffers in these cases because the extracted feature maps lack clear geometric gradients or distinct surface boundaries, which are critical for predicting inclination. When the input features are either noisy or spatially ambiguous, as often occurs in high-altitude and slanted-view scenarios, the regression layer cannot establish a consistent mapping between appearance patterns and the corresponding inclination angles. Consequently, the predicted angles exhibit higher variance and reduced correlation with the ground truth. These observations highlight that the proposed framework\u0026rsquo;s performance is highly dependent on camera viewpoint, altitude, and scene complexity. Addressing these limitations may involve strategies such as multi-view feature fusion, resolution-adaptive training, or incorporating depth priors to strengthen both detection and regression accuracy under varying imaging conditions.\u003c/p\u003e \u003cp\u003eOverall, the results suggest that the proposed ensemble framework is a promising approach for identifying potential landing sites by estimating the inclination angles of surfaces. By enabling the detection of multiple surface inclinations within a single frame, the framework provides a practical solution for UAVs to identify and land safely on inclined terrains. The experiments also demonstrate that the architecture can estimate inclination angles under varying camera view angles. Nevertheless, further improvements are necessary: specifically, incorporating the camera view angle as an explicit input parameter during training could enhance overall model efficiency. 
Additionally, the framework may be extended to estimate other critical surface attributes, such as roughness and steepness, to support more comprehensive and reliable landing site selection.\u003c/p\u003e \u003c/div\u003e"},{"header":"6. CONCLUSION","content":"\u003cp\u003eThis work introduces the UAV-Landing Inclination Dataset (UAV-LID), comprising a synthetic ROS-Gazebo dataset with precise ground-truth labels and a real-world aerial dataset capturing diverse surfaces and altitudes. This work also proposes an ensemble learning framework for UAV-based surface inclination estimation, which can be used for autonomous landing site selection. By unifying surface detection with inclination angle regression, the framework demonstrates efficient performance. UAV-LID addresses a critical data gap and enables robust validation and domain transfer analysis. The framework\u0026rsquo;s lightweight design and reliance on monocular vision make it practical for resource-constrained UAV platforms where conventional sensing is prohibitive. Its applicability extends beyond landing site selection to terrain analysis, surveying, and environmental monitoring. 
Future work will expand UAV-LID and integrate additional landing parameters, environmental dynamics, and multi-modal fusion to further enhance terrain characterization and safe UAV autonomy.\u003c/p\u003e"},{"header":"Declarations","content":" \u003cp\u003eThe research leading to these results received funding from DRDO \u0026ndash; Aeronautical Research \u0026amp; Development Board for carrying out this activity.\u003c/p\u003e \u003ch2\u003eAuthor Contribution\u003c/h2\u003e\n\u003cp\u003eIN and DK conceptualized the idea, wrote the first draft, and carried out the simulations; IN, DK, NB and SP carried out the field experiments; DK and NB carried out the experimentation on GPUs; IN and SP proposed the methodology and designed the experiments; IN and SP carried out the final review; and SP provided the necessary resources and funding for this work.\u003c/p\u003e\n\u003ch2\u003eAcknowledgement\u003c/h2\u003e\n\u003cp\u003eThe authors would like to acknowledge the funding support received from DRDO \u0026ndash; Aeronautical Research \u0026amp; Development Board for carrying out this activity. The authors would also like to thank the Director CSIO for providing the necessary support. The authors would like to acknowledge DGCA and ATC for providing the necessary permissions for carrying out the drone flight activities.\u003c/p\u003e\n\u003ch2\u003eData Availability\u003c/h2\u003e\n\u003cp\u003eThe dataset discussed in the manuscript will be made publicly available in a GitHub repository once the article is published.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eGautam, A., Sujit, P.B., Saripalli, S.: A survey of autonomous landing techniques for UAVs, in \u003cem\u003einternational conference on unmanned aircraft systems (ICUAS)\u003c/em\u003e, IEEE, 2014, pp. 1210\u0026ndash;1218. Accessed: Sept. 17, 2025. [Online]. Available: (2014). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ieeexplore.ieee.org/abstract/document/6842377/\u003c/span\u003e\u003cspan address=\"https://ieeexplore.ieee.org/abstract/document/6842377/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu, L., et al.: Deep learning for vision-based micro aerial vehicle autonomous landing. Int. J. Micro Air Veh. \u003cb\u003e10\u003c/b\u003e(2), 171\u0026ndash;185 (June 2018). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1177/1756829318757470\u003c/span\u003e\u003cspan address=\"10.1177/1756829318757470\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXin, L., Tang, Z., Gai, W., Liu, H.: Vision-based autonomous landing for the UAV: A review. Aerospace. \u003cb\u003e9\u003c/b\u003e(11), 634 (2022)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBaidya, R., Jeong, H.: Simulation and real-life implementation of UAV autonomous landing system based on object recognition and tracking for safe landing in uncertain environments. Front. Rob. AI. \u003cb\u003e11\u003c/b\u003e, 1450266 (2024)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaripalli, S., Montgomery, J.F., Sukhatme, G.S.: Vision-based autonomous landing of an unmanned aerial vehicle, in \u003cem\u003eProceedings IEEE international conference on robotics and automation (Cat. No. 02CH37292)\u003c/em\u003e, IEEE, 2002, pp. 2799\u0026ndash;2804. Accessed: Sept. 17, 2025. [Online]. Available: (2002). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ieeexplore.ieee.org/abstract/document/1013656/\u003c/span\u003e\u003cspan address=\"https://ieeexplore.ieee.org/abstract/document/1013656/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang, H., Zhao, J., American Society of Mechanical Engineers, T21A004: Vision based surface slope estimation for unmanned aerial vehicle perching, in \u003cem\u003eDynamic Systems and Control Conference\u003c/em\u003e, p. V002. Accessed: Sept. 17, 2025. [Online]. (2018). Available: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://asmedigitalcollection.asme.org/DSCC/proceedings-abstract/DSCC2018/51906/270966\u003c/span\u003e\u003cspan address=\"https://asmedigitalcollection.asme.org/DSCC/proceedings-abstract/DSCC2018/51906/270966\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZHANG, Z., Quanrui, C., Qiufu, W., Xiaoliang, S.U.N., Qifeng, Y.U.: Monocular visual estimation for autonomous aircraft landing guidance in unknown structured scenes. Chin. J. Aeronaut., p. 103479, (2025)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKakaletsis, E., Nikolaidis, N.: Potential UAV Landing Sites Detection through Digital Elevation Models Analysis, July 14, \u003cem\u003earXiv\u003c/em\u003e: arXiv:2107.06921. (2021). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.48550/arXiv.2107.06921\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2107.06921\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGarcia-Pulido, J.A., Pajares, G., Dormido, S., de la Cruz, J.M.: Recognition of a landing platform for unmanned aerial vehicles by using computer vision-based techniques. Expert Syst. Appl. \u003cb\u003e76\u003c/b\u003e, 152\u0026ndash;165 (2017)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLim, J., Kim, M., Yoo, H., Lee, J.: Autonomous multirotor UAV search and landing on safe spots based on combined semantic and depth information from an onboard camera and LiDAR. IEEE/ASME Trans. Mechatron. \u003cb\u003e29\u003c/b\u003e(5), 3960\u0026ndash;3970 (2024)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLin, S., Jin, L., Chen, Z.: Real-time monocular vision system for UAV autonomous landing in outdoor low-illumination environments. Sensors. \u003cb\u003e21\u003c/b\u003e(18), 6226 (2021)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChatzikalymnios, E., Moustakas, K.: Landing Site Detection for Autonomous Rotor Wing UAVs Using Visual and Structural Information, \u003cem\u003eJ Intell Robot Syst\u003c/em\u003e, vol. 104, no. 2, p. 27, Feb. (2022). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s10846-021-01544-6\u003c/span\u003e\u003cspan address=\"10.1007/s10846-021-01544-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePark, J., Kim, Y., Kim, S.: Landing site searching and selection algorithm development using vision system and its application to quadrotor. IEEE Trans. Control Syst. Technol. 
\u003cb\u003e23\u003c/b\u003e(2), 488\u0026ndash;503 (2014)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMittal, M., Mohan, R., Burgard, W., Valada, A.: Vision-Based Autonomous UAV Navigation and Landing for Urban Search and Rescue, in \u003cem\u003eRobotics Research\u003c/em\u003e, vol. 20, T. Asfour, E. Yoshida, J. Park, H. Christensen, and O. Khatib, Eds., in Springer Proceedings in Advanced Robotics, vol. 20., Cham: Springer International Publishing, pp. 575\u0026ndash;592. (2022). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-3-030-95459-8_35\u003c/span\u003e\u003cspan address=\"10.1007/978-3-030-95459-8_35\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNeves, F.S., Branco, L.M., Pereira, M.I., Claro, R.M., Pinto, A.M., A multimodal learning-based approach for autonomous landing of uav, in: \u003cem\u003e20th IEEE/ASME International Conference on Mechatronic and Embedded\u003c/em\u003e Systems and Applications \u003cem\u003e(MESA)\u003c/em\u003e, IEEE, 2024, pp. 1\u0026ndash;8. Accessed: Sept. 17, 2025. [Online]. Available: (2024). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ieeexplore.ieee.org/abstract/document/10704866/\u003c/span\u003e\u003cspan address=\"https://ieeexplore.ieee.org/abstract/document/10704866/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarcu, A., Costea, D., Licaret, V., P\u0026icirc;rvu, M., Slusanschi, E., Leordeanu, M.: SafeUAV: Learning to estimate depth and safe landing areas for UAVs from synthetic data, in \u003cem\u003eProceedings of the European Conference on Computer Vision (ECCV) Workshops\u003c/em\u003e, pp. 0\u0026ndash;0. Accessed: Sept. 17, 2025. [Online]. (2018). 
Available: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://openaccess.thecvf.com/content_eccv_2018_workshops/w7/html/Marcu_SafeUAV_Learning_to_estimate_depth_and_safe_landing_areas_for_ECCVW_2018_paper.html\u003c/span\u003e\u003cspan address=\"https://openaccess.thecvf.com/content_eccv_2018_workshops/w7/html/Marcu_SafeUAV_Learning_to_estimate_depth_and_safe_landing_areas_for_ECCVW_2018_paper.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang, C.-Y., Bochkovskiy, A., Liao, H.-Y.M.: YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors, in \u003cem\u003eProceedings of the IEEE/CVF conference on computer vision and pattern recognition\u003c/em\u003e, pp. 7464\u0026ndash;7475. Accessed: Sept. 17, 2025. [Online]. (2023). Available: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://openaccess.thecvf.com/content/CVPR2023/html/Wang_YOLOv7_Trainable_Bag-of-Freebies_Sets_New_State-of-the-Art_for_Real-Time_Object_Detectors_CVPR_2023_paper.html\u003c/span\u003e\u003cspan address=\"http://openaccess.thecvf.com/content/CVPR2023/html/Wang_YOLOv7_Trainable_Bag-of-Freebies_Sets_New_State-of-the-Art_for_Real-Time_Object_Detectors_CVPR_2023_paper.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKoonce, B.: VGG Network. In: Convolutional Neural Networks with Swift for Tensorflow, pp. 35\u0026ndash;50. Apress, Berkeley, CA (2021). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-1-4842-6168-2_4\u003c/span\u003e\u003cspan address=\"10.1007/978-1-4842-6168-2_4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWoo, S., et al.: ConvNeXt V2: Co-designing and scaling convnets with masked autoencoders, in \u003cem\u003eProceedings of the IEEE/CVF conference on computer vision and pattern recognition\u003c/em\u003e, pp. 16133\u0026ndash;16142. Accessed: Sept. 18, 2025. [Online]. (2023). Available: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://openaccess.thecvf.com/content/CVPR2023/html/Woo_ConvNeXt_V2_Co-Designing_and_Scaling_ConvNets_With_Masked_Autoencoders_CVPR_2023_paper.html\u003c/span\u003e\u003cspan address=\"http://openaccess.thecvf.com/content/CVPR2023/html/Woo_ConvNeXt_V2_Co-Designing_and_Scaling_ConvNets_With_Masked_Autoencoders_CVPR_2023_paper.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKoonce, B.: EfficientNet. In: Convolutional Neural Networks with Swift for Tensorflow, pp. 109\u0026ndash;123. Apress, Berkeley, CA (2021). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-1-4842-6168-2_10\u003c/span\u003e\u003cspan address=\"10.1007/978-1-4842-6168-2_10\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShang, J., Zhang, K., Zhang, Z., Li, C., Liu, H.: A high-performance convolution block oriented accelerator for MBConv-Based CNNs. Integration. 
\u003cb\u003e88\u003c/b\u003e, 298\u0026ndash;312 (2023)\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"international-journal-of-intelligent-robotics-and-applications","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"jira","sideBox":"Learn more about [International Journal of Intelligent Robotics and Applications](https://www.springer.com/journal/41315)","snPcode":"41315","submissionUrl":"https://submission.springernature.com/new-submission/41315/3","title":"International Journal of Intelligent Robotics and Applications","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"UAV landing, inclination, deep learning, detection, ConvNext, EfficientNet","lastPublishedDoi":"10.21203/rs.3.rs-8472654/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8472654/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eUnmanned Aerial Vehicles (UAVs) has emerged as a transformative tool for 3D reconstruction, offering diverse applications in urban planning, infrastructure monitoring, and emergency response. This work introduces a combination of synthetic and real-world visual image dataset for estimating inclination of surfaces from UAV and is termed as UAV-Landing Inclination Dataset (UAV-LID). This work also proposes an ensemble deep learning architecture that carries out detection of possible landing surfaces and their inclination angle estimation. The surface detection architecture uses YOLOv7 module for surface detection while the inclination angle estimator uses different backbone architectures to estimate inclination. The dataset consists of visual images of different kinds of possible surfaces at different heights and inclination angles. 
Different backbones such as VGG16, EfficientNet, and ConvNext based architectures have been experimented here for the task of inclination estimation, of which the EfficientNet based architecture shows promising performance. Experimental results show that deep learning-based networks can be used effectively for this purpose and in future, can be extended for landing of UAV on slanted surfaces directly.\u003c/p\u003e","manuscriptTitle":"UAV-Landing Inclination Dataset: Enabling Inclination-Aware Surface Detection from UAV","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-14 09:46:35","doi":"10.21203/rs.3.rs-8472654/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"editorInvitedReview","content":"","date":"2026-05-13T18:39:10+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-05-09T22:06:45+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"77226599103692975527275128444404151809","date":"2026-04-12T17:58:40+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"51814207903242651588263818716730523784","date":"2026-04-11T15:12:35+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"308111315733639457798453617406155518290","date":"2026-04-07T08:03:56+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-04-07T07:39:57+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-03-25T08:39:18+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-12-29T11:53:34+00:00","index":"","fulltext":""},{"type":"submitted","content":"International Journal of Intelligent Robotics and Applications","date":"2025-12-29T11:31:24+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"international-journal-of-intelligent-robotics-and-applications","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"jira","sideBox":"Learn more about [International Journal of Intelligent Robotics and Applications](https://www.springer.com/journal/41315)","snPcode":"41315","submissionUrl":"https://submission.springernature.com/new-submission/41315/3","title":"International Journal of Intelligent Robotics and Applications","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"8136bbcc-14b8-42f6-866d-f474362751e4","owner":[],"postedDate":"April 14th, 2026","published":true,"recentEditorialEvents":[{"type":"editorInvitedReview","content":"","date":"2026-05-13T18:39:10+00:00","index":55,"fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-05-09T22:06:45+00:00","index":54,"fulltext":""}],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-04-14T09:46:35+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-14 09:46:35","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8472654","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8472654","identity":"rs-8472654","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}