Scale is an important parameter of images. Different objects or image structures (e.g. edges and corners) can appear at different scales, and each is meaningful only over a limited range of scales. Multi-scale analysis has been widely used in image processing and computer vision, serving as the basis for many high-level image analysis systems. One such high-level system is based on supervised learning, as studied in pattern recognition and machine learning, and may take the results of multi-scale analysis as its input. Supervised learning defines a classifier that assigns objects to different categories, and learns the classifier from example objects whose category labels are known. A common characteristic of current multi-scale analysis methods, however, is that they are designed without specific assumptions about the high-level image analysis systems they serve. The problem is that different tasks need images to be analysed at different scales, that is, they need different multi-scale analyses. For example, for the same image containing a person, small scales are needed if the problem is to segment the eyes, while large scales are needed to segment the whole person. In many applications, the task is defined only by some given example images, and the right scale of analysis is not known a priori. This calls for multi-scale analysis frameworks that can adapt to different tasks. The aim of this thesis is to study such adaptive multi-scale frameworks based on supervised learning. It focuses on three important aspects of multi-scale analysis: scale selection, scale invariance, and scale combining. Scale selection addresses the problem of choosing the right scale at which to detect an object or analyse an image. Scale invariance is the ability to deal with objects appearing at arbitrary sizes. Scale combining concerns the combination of information from all scales. General learning frameworks are proposed for these three aspects.
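To make the notion of analysing an image at multiple scales concrete, the following is a minimal sketch of a Gaussian scale space, a standard construction in multi-scale analysis; the thesis does not necessarily use this exact construction, and the toy image and scale values are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_scale_space(image, sigmas):
    """Return the image smoothed at a range of scales (Gaussian std. devs.)."""
    return [gaussian_filter(image.astype(float), sigma=s) for s in sigmas]

# A toy image with a small bright structure on a dark background.
img = np.zeros((32, 32))
img[15:17, 15:17] = 1.0

levels = gaussian_scale_space(img, sigmas=[0.5, 1.0, 2.0, 4.0])

# Smoothing spreads intensity out, so the peak response of a small
# structure decreases as scale grows: the structure "lives" at small
# scales and disappears at large ones.
peaks = [lvl.max() for lvl in levels]
```

This illustrates why each structure is meaningful only over a limited scale range: at coarse scales a small structure is blurred away, while at fine scales a large structure is not seen as a whole.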
Examples are shown for image segmentation and classification problems. A learning-based scale selection method is proposed for supervised image segmentation. Supervised segmentation trains a classifier, based on some given segmented images, that assigns the pixels of an image to different classes or segments. The input of the classifier consists of features extracted from a neighbourhood around each pixel, and the scale of this neighbourhood is a crucial parameter of the features. Scale is usually selected as the size of a certain image structure, which, however, is not necessarily best for the segmentation task. With this in mind, the selected scale for supervised segmentation is redefined as the one at which pixels from different classes are best separable. A general scale selection scheme is proposed, which relies on the classifier used for segmentation to measure class separability. Experiments show that this scheme can indeed choose the scales that are best for the segmentation problem and thus leads to significantly improved performance. Based on the proposed scale selection scheme, a scale-invariant classification framework is proposed for supervised image segmentation. This classifier can deal with images at arbitrary scales; consequently, the same segmentation result is obtained when an image is resized. The classifier is trained with image features from all scales, and is thus able to handle images at any scale. To avoid biasing the classifier towards particular scales, the right proportion of features from the different scales is needed. Scale invariance of the classification is achieved by applying the proposed scale selection scheme in the testing phase, which finds the right scales for image structures of different sizes. A learning model closely related to the proposed scale-invariant classification is multiple-instance learning (MIL).
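The idea of selecting the scale at which the pixel classes are best separable can be sketched as follows. This is a toy illustration under simplifying assumptions of my own: the per-pixel feature is just the Gaussian-smoothed intensity, and separability is measured with a Fisher-style criterion rather than the actual segmentation classifier used in the thesis.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fisher_separability(f0, f1):
    """Fisher criterion: between-class over within-class scatter."""
    m0, m1 = f0.mean(), f1.mean()
    return (m0 - m1) ** 2 / (f0.var() + f1.var() + 1e-12)

def select_scale(image, labels, sigmas):
    """Pick the smoothing scale at which the two pixel classes separate best."""
    scores = []
    for s in sigmas:
        feat = gaussian_filter(image.astype(float), sigma=s)
        scores.append(fisher_separability(feat[labels == 0], feat[labels == 1]))
    return sigmas[int(np.argmax(scores))], scores

# Toy segmentation problem: two noisy regions that differ only in mean
# intensity, so per-pixel values overlap heavily at fine scales.
rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, (64, 64))
img[:, 32:] += 0.8                      # class 1 is slightly brighter
lab = np.zeros((64, 64), int)
lab[:, 32:] = 1

best_sigma, scores = select_scale(img, lab, sigmas=[0.5, 1, 2, 4, 8])
```

Smoothing suppresses the within-class noise faster than it erodes the between-class mean difference, so a coarser scale separates the classes better than the finest one; too coarse a scale would eventually blur the classes into each other, which is why the criterion has to be evaluated per task rather than fixed in advance.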
MIL is a generalised supervised-learning framework that represents an object as a bag of many feature vectors called instances. Only some of the instances in a bag are informative about the label of the object, while the others share the same probability distribution across objects from different classes. In the training phase, only the labels of bags (not instances) are known, and a classifier is trained to separate bags into different classes. These characteristics make MIL well suited to multi-scale image analysis, as an object can be represented by a set of features from all scales, of which only the features from some scales are informative. Features from the other scales are uninformative because the object becomes too blurred or too small to be distinguished from other objects. Observing that MIL algorithms usually make effective use of only one, not all, of the informative instances in a bag, we propose a new MIL model that aims to exploit all of them. A simple MIL classifier is obtained, which performs very well on numerous data sets in the experiments. Combining information from multiple scales is studied based on the dissimilarity representation. It has been recognised that information from more than one scale can be useful for image analysis and should be exploited for better performance. In learning-based image analysis, multi-scale information is usually combined by concatenating the features from all scales, which typically creates an enormously high-dimensional feature vector and thus makes learning difficult. We use the dissimilarity representation instead, as it enables combining multi-scale information without increasing the dimensionality of the representation space. It represents an image by its dissimilarities to a set of reference images. Multi-scale information is exploited by computing dissimilarities at each scale and then combining these dissimilarities. Various combining rules are proposed and tested on real-world image classification problems.
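The bag-of-instances setting can be sketched with the classical max-rule for bag classification: a bag is labelled positive if at least one instance scores positively. This is the standard baseline that uses a single informative instance per bag, i.e. the behaviour the proposed MIL model improves upon, not the new model itself; the toy data and the linear instance scorer are illustrative assumptions.

```python
import numpy as np

def bag_scores(bags, w, b):
    """Score each bag as the maximum instance score under a linear model:
    a bag is positive if at least one instance lies on the positive side."""
    return np.array([np.max(inst @ w + b) for inst in bags])

# Toy 1-D instances: negative bags contain only background instances near 0;
# positive bags additionally contain one informative instance near 2
# (e.g. the feature from the one scale at which the object is visible).
rng = np.random.default_rng(1)
neg = [rng.normal(0.0, 0.3, (5, 1)) for _ in range(10)]
pos = [np.vstack([rng.normal(0.0, 0.3, (4, 1)),
                  rng.normal(2.0, 0.3, (1, 1))]) for _ in range(10)]

w, b = np.array([1.0]), -1.0            # threshold instances at x > 1
scores = bag_scores(neg + pos, w, b)
preds = (scores > 0).astype(int)
labels = np.array([0] * 10 + [1] * 10)
acc = (preds == labels).mean()
```

Because the max-rule is decided by a single instance, any further informative instances in the bag are ignored; exploiting them is exactly the motivation stated above.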
The results show that simple combining rules already improve significantly upon the best result from the individual scales, and that more adaptive rules, which exploit structure along the scale dimension, lead to even better results.
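The key property of the dissimilarity representation, that combining scales does not increase the dimensionality, can be sketched as follows. The Euclidean distance as the dissimilarity measure and the simple mean/min combiners are illustrative assumptions; the thesis studies a wider set of combining rules.

```python
import numpy as np

def dissimilarity_rep(X, R):
    """Represent each object by its Euclidean distances to a reference set R."""
    return np.sqrt(((X[:, None, :] - R[None, :, :]) ** 2).sum(-1))

def combine_scales(dissims, rule="mean"):
    """Combine per-scale dissimilarity matrices into one representation.
    The dimensionality stays len(R), regardless of the number of scales."""
    D = np.stack(dissims)               # shape (n_scales, n_objects, n_refs)
    return D.mean(0) if rule == "mean" else D.min(0)

# Toy multi-scale data: the same 4 objects described by features at 3 scales.
rng = np.random.default_rng(2)
scales = [rng.normal(size=(4, 6)) for _ in range(3)]
refs = [s[:2] for s in scales]          # first 2 objects double as references

per_scale = [dissimilarity_rep(Xs, Rs) for Xs, Rs in zip(scales, refs)]
combined = combine_scales(per_scale, rule="mean")
```

Each object ends up with one value per reference image whether one scale or ten are used, whereas concatenating raw features would multiply the dimensionality by the number of scales.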