Handle multiple images + new output options #52

Open · wants to merge 2 commits into `master`
22 changes: 11 additions & 11 deletions README.md
By [Iro Laina](http://campar.in.tum.de/Main/IroLaina), Christian Rupprecht, et al.

## Introduction

This repository contains the CNN models trained for depth prediction from a single RGB image, as described in the paper "[Deeper Depth Prediction with Fully Convolutional Residual Networks](https://arxiv.org/abs/1606.00373)". The provided models are those that were used to obtain the results reported in the paper on the benchmark datasets NYU Depth v2 and Make3D for indoor and outdoor scenes respectively. Moreover, the provided code can be used for inference on arbitrary images.


## Quick Guide

The trained models are currently provided in two frameworks, MatConvNet and TensorFlow. Please read below for more information on how to get started.

### TensorFlow
The code provided in the *tensorflow* folder requires a successful installation of the [TensorFlow](https://www.tensorflow.org/) library (any platform).
The model's graph is constructed in ```fcrn.py``` and the corresponding weights can be downloaded using the link below. The implementation is based on [ethereon's](https://github.com/ethereon/caffe-tensorflow) Caffe-to-TensorFlow conversion tool.
```predict.py``` provides sample code for using the network to predict the depth maps of one or more input images. Use ```python predict.py -p NYU_FCRN.ckpt yourimage.jpg``` to try the code.
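
For batch use, the following is a minimal sketch of calling this PR's `predict()` directly from Python; the checkpoint path, image glob, and output folder are placeholder assumptions:

```python
# Sketch (placeholder paths): run predict() from this PR over a set of images,
# writing .npy and .mat depth maps to ./out without opening plot windows.
from predict import predict

predict("NYU_FCRN.ckpt", "images/*.jpg",
        output_folder="out", show_plot=False, output_format="npy,mat")
```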

### MatConvNet

**Prerequisites**

The code provided in the *matlab* folder requires the [MatConvNet toolbox](http://www.vlfeat.org/matconvnet/) for CNNs. A version of the library equal to or newer than 1.0-beta20 must be successfully compiled, with or without GPU support.
Furthermore, the user should modify ``` matconvnet_path = '../matconvnet-1.0-beta20' ``` within `evaluateNYU.m` and `evaluateMake3D.m` so that it points to the directory where the library is stored.

**How-to**

To acquire the predicted depth maps and run the evaluation on the NYU or Make3D *test sets*, simply run `evaluateNYU.m` or `evaluateMake3D.m` respectively. All required data and models will then be downloaded automatically (if they do not already exist), and no further user intervention is needed beyond setting the options `opts` and `netOpts` as preferred. Make sure that you have enough free disk space (up to 5 GB). The predictions will eventually be saved in a .mat file in the specified directory.

Alternatively, one could run `DepthMapPrediction.m` in order to manually use a trained model in test mode to predict the depth maps of arbitrary images.

## Models

The trained models - namely **ResNet-UpProj** in the paper - can also be downloaded.

## Results

**NEW!** The predictions for the validation set of the NYU Depth v2 dataset can also be downloaded [here](http://campar.in.tum.de/files/rupprecht/depthpred/predictions_NYUval.mat) (.mat).

In the following tables, we report the results that should be obtained after evaluation, and compare them to other recent methods for depth prediction from a single image; a sketch of how these metrics are computed follows the first table.
- Error metrics on NYU Depth v2:

| State of the art on NYU | rel | rms | log10 |
|-----------------------------|:-----:|:-----:|:-----:|
| [Roy & Todorovic](http://web.engr.oregonstate.edu/~sinisa/research/publications/cvpr16_NRF.pdf) (_CVPR 2016_) | 0.187 | 0.744 | 0.078 |
| [Eigen & Fergus](http://cs.nyu.edu/~deigen/dnl/) (_ICCV 2015_) | 0.158 | 0.641 | - |
| **Ours** | **0.127** | **0.573** | **0.055** |
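
For reference, here is a minimal sketch of how these standard error metrics are typically computed, assuming `pred` and `gt` are depth maps in meters with strictly positive ground-truth values (these are the common definitions from the literature, not code shipped in this repository):

```python
import numpy as np

def depth_metrics(pred, gt):
    # Evaluate only pixels with valid (positive) ground-truth depth.
    mask = gt > 0
    pred, gt = pred[mask], gt[mask]
    rel = np.mean(np.abs(pred - gt) / gt)                   # mean absolute relative error
    rms = np.sqrt(np.mean((pred - gt) ** 2))                # root mean squared error
    log10 = np.mean(np.abs(np.log10(pred) - np.log10(gt)))  # mean log10 error
    return rel, rms, log10
```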

- Error metrics on Make3D:

| State of the art on Make3D | rel | rms | log10 |
|-----------------------------|:-----:|:-----:|:-----:|
98 changes: 65 additions & 33 deletions tensorflow/predict.py
import argparse
import os
import numpy as np
import tensorflow as tf
from matplotlib import pyplot as plt
from PIL import Image
from glob import glob
import scipy.io as scio

import models

def predict(model_data_path, image_path, output_folder, show_plot=True, output_format="npy"):

    # Setup input and output
    images = sorted(glob(image_path))
    if output_folder and not os.path.isdir(output_folder):
        os.mkdir(output_folder)
    o_formats = output_format.split(',')
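    # Note: "image_path" may be a shell-style glob such as "images/*.jpg", and
    # "output_format" is a comma-separated list of formats, e.g. "npy,mat,img".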


    # Default input size
    height = 228
    width = 304
    channels = 3
    batch_size = 1

    # Create a placeholder for the input image
    input_node = tf.placeholder(tf.float32, shape=(None, height, width, channels))

    # Construct the network
    net = models.ResNet50UpProj({'data': input_node}, batch_size, 1, False)

    with tf.Session() as sess:

        # Load the converted parameters
        print('Loading the model')

        # Use to load from ckpt file
        saver = tf.train.Saver()
        saver.restore(sess, model_data_path)

        # Use to load from npy file
        #net.load(model_data_path, sess)
        for i, img_path in enumerate(images):
            print("Processing image {} / {}".format(i + 1, len(images)))

            # Read image
            img = Image.open(img_path)
            img = img.resize([width, height], Image.ANTIALIAS)
            img = np.array(img).astype('float32')
            img = np.expand_dims(np.asarray(img), axis=0)

            # Evaluate the network for the given image
            pred = sess.run(net.get_output(), feed_dict={input_node: img})
            pred = np.squeeze(pred)

            # Write result
            if output_folder:
                input_filename, ext = os.path.splitext(os.path.basename(img_path))
                filename = os.path.join(output_folder, "{}_depth".format(input_filename))

                if "mat" in o_formats:
                    scio.savemat(filename,
                                 {'depth': pred},
                                 do_compression=True)

                if "npy" in o_formats:
                    np.save(filename, pred)

                if "img" in o_formats:
                    fig = plt.figure()
                    ii = plt.imshow(pred, interpolation='nearest')
                    fig.colorbar(ii)
                    plt.savefig(filename)
                    plt.close(fig)

            # Plot result
            if show_plot:
                fig = plt.figure()
                ii = plt.imshow(pred, interpolation='nearest')
                fig.colorbar(ii)
                plt.show()


def main():
    # Parse arguments
    parser = argparse.ArgumentParser()
    parser.add_argument('model_path', help='Converted parameters for the model')
    parser.add_argument('image_path', help='Images to predict; can be a glob pattern')
    parser.add_argument('-o', '--output_folder', type=str, default=None,
                        help='Path to output depth maps')
    parser.add_argument('-f', '--output_format', type=str, default="npy",
                        help='Output format as a comma-separated list; can be img, npy or mat')
    # action='store_true' makes -p a proper flag; the original type=bool would
    # consume the next argument and treat any non-empty string as True.
    parser.add_argument('-p', '--plot', action='store_true',
                        help='Plot output image with colorbar on the screen')
    args = parser.parse_args()

    # Predict the image(s)
    predict(args.model_path, args.image_path, args.output_folder, args.plot, args.output_format)

    # Hard exit so lingering TensorFlow threads cannot keep the process alive.
    os._exit(0)

if __name__ == '__main__':
    main()
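
With the options above, a typical batch invocation (hypothetical paths) would be ```python predict.py -p -o out -f npy,img NYU_FCRN.ckpt "images/*.png"```, which saves each depth map as both a .npy array and a colorized image while also plotting it on screen.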