Research and industry proceed to build autonomously driving cars, self-navigating unmanned aerial vehicles, and intelligent mobile robots. Furthermore, due to demographic change, intelligent assistance devices for visually impaired and elderly people are in demand. These tasks require systems which reliably map the 3D surroundings and are able to self-localize in the created maps. Such systems and methods can be summarized under the terms simultaneous localization and mapping (SLAM) and visual odometry (VO). The SLAM problem has been broadly studied for different visual sensors such as monocular, stereo, and RGB-D cameras. During the last decade, plenoptic cameras (or light field cameras) have become available as commercial products. They capture 4D light field information in a single image. This light field information can be used, for instance, to retrieve 3D structure.

This dissertation deals with the task of VO for plenoptic cameras. For this purpose, a new model for micro lens array (MLA) based light field cameras is developed. On the basis of this model, a multiple view geometry (MVG) for MLA based light field cameras is derived.

An efficient probabilistic depth estimation approach for MLA based light field cameras is presented. The method establishes semi-dense depth maps directly from the micro images recorded by the camera. Multiple depth observations are merged into a probabilistic depth hypothesis. Disparity uncertainties resulting, e.g., from differently focused micro images and from sensor noise are taken into account. This algorithm is initially developed on the basis of single light field images and is later extended by the introduced MVG.

Furthermore, calibration approaches for focused plenoptic cameras at different levels of complexity are presented.
We begin with depth conversion functions, which convert the virtual depth estimated by the camera into metric distances, and then proceed to derive a plenoptic camera model on the basis of the estimated virtual depth map and the corresponding synthesized, totally focused image. This model takes into consideration depth distortion and a sensor which is tilted relative to the main lens. Finally, this leads to a plenoptic camera model which defines the complete projection of an object point to multiple micro images on the sensor. Based on this model, we emphasize the importance of modeling squinting micro lenses. The depth conversion functions are estimated from a series of range measurements, while the parameters of all other models are estimated in a bundle adjustment using a 3D calibration target. The bundle adjustment based methods significantly outperform existing approaches.

The plenoptic camera based MVG, the depth estimation approach, and the model obtained from the calibration are combined in a VO algorithm called Direct Plenoptic Odometry (DPO). DPO works directly on the recorded micro images and therefore does not have to deal with aliasing effects in the spatial domain. The algorithm generates semi-dense 3D point clouds on the basis of correspondences in subsequent light field frames. A scale optimization framework is used to correct scale drift and wrong absolute scale initializations. To the best of our knowledge, it is the first method that performs tracking and mapping for plenoptic cameras directly on the micro images. DPO is tested on a variety of indoor and outdoor sequences. With respect to scale drift, it outperforms state-of-the-art monocular VO and SLAM algorithms. Regarding absolute accuracy, DPO is competitive with existing monocular and stereo approaches.
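The probabilistic merging of depth observations mentioned above can be illustrated by a minimal sketch. Assuming each depth hypothesis is modeled as a Gaussian with a mean and a variance (an assumption made here for illustration; function and variable names are hypothetical and not the dissertation's actual implementation), two hypotheses can be fused by inverse-variance weighting, so that more uncertain observations contribute less:

```python
# Hypothetical sketch: fusing two Gaussian depth hypotheses (mean, variance)
# by inverse-variance weighting. More uncertain observations, e.g. from
# defocused micro images or sensor noise, receive a smaller weight.

def fuse_hypothesis(mu_a, var_a, mu_b, var_b):
    """Merge two Gaussian hypotheses into one; the fused variance shrinks."""
    var = (var_a * var_b) / (var_a + var_b)
    mu = (mu_a * var_b + mu_b * var_a) / (var_a + var_b)
    return mu, var

# Example: a noisier observation slightly refines an existing hypothesis.
mu, var = fuse_hypothesis(2.0, 0.5, 2.4, 1.0)
```

Repeated fusion of this kind lets a per-pixel hypothesis converge as more micro-image observations arrive, while outliers with large variance have little influence.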
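The estimation of a depth conversion function from range measurements can likewise be sketched. The parametric form z(v) = a + b/v used below is purely an assumption for illustration (the dissertation derives its conversion functions from the camera model itself); fitting it to measured virtual-depth/distance pairs is a simple linear least-squares problem:

```python
import numpy as np

# Hypothetical sketch: fit a depth conversion function z(v) = a + b / v
# mapping virtual depth v to metric distance z, from range measurements.
# The model form and data are illustrative assumptions only.

def fit_conversion(v, z):
    """Least-squares fit of the coefficients (a, b) of z(v) = a + b / v."""
    A = np.column_stack([np.ones_like(v), 1.0 / v])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs

# Synthetic measurements generated from known coefficients a=0.5, b=3.0.
v = np.array([2.0, 3.0, 4.0, 6.0])
z = 0.5 + 3.0 / v
a, b = fit_conversion(v, z)
```

In practice the measured distances would be noisy, and the fitted function would then be evaluated on new virtual depth maps to obtain metric distances.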