Error opening file in h5py (File signature not found)

I've been using the following bit of code to open some HDF5 files, produced in MATLAB, in Python using h5py:

import h5py as h5
# data holds the path to the MAT-file
f = h5.File(data, 'r')

However I'm getting the following error:

OSError: Unable to open file (File signature not found)

I've checked that the files I'm trying to open are version 7.3 MAT-files and are in HDF5 format. In fact, I've used h5py to open the same files successfully before. I've confirmed that the files exist and are accessible, so I'm not really sure where the error is coming from. Any advice would be greatly appreciated, thanks in advance : )

I was facing the same issue with my .h5 file. The problem was that I was not downloading the .h5 file correctly.

I was doing filename.h5 -> right click -> "Save link as", which did not download the file correctly (or maybe the file was getting corrupted). Instead, I selected the checkbox next to filename.h5 and clicked Download, and after that my code worked.

Maybe this helps others who are making the same mistake.
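If you suspect a bad download like this, you can look at the raw bytes before handing the file to h5py. Below is a minimal sketch using only the standard library. The function names find_hdf5_signature and looks_like_html are made up for illustration, and the HTML check is only a rough heuristic for the case where the browser saved a web page instead of the data. The HDF5 format allows the 8-byte signature at offset 0, 512, 1024, 2048 and so on, which is why the search doubles the offset.

```python
import os

# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_hdf5_signature(path):
    """Return the byte offset of the HDF5 signature, or None if it is absent."""
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            # Allowed superblock locations: 0, 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    return None


def looks_like_html(path):
    """Rough heuristic: does the file start like a saved web page?"""
    with open(path, "rb") as fh:
        head = fh.read(256).lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))
```

If find_hdf5_signature returns None, h5py would be expected to fail with the same "File signature not found" message, and if looks_like_html is True the download most likely saved a web page rather than the data file.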

Usually the message File signature not found indicates either:

1. Your file is corrupted.

This is what I think is most likely. You said you've opened the files before. Maybe you forgot to close a file handle, which can corrupt the file. Try checking the file with the HDF5 utility h5debug (available on the command line if you've installed the HDF5 library on your OS; on Linux, check with dpkg -s libhdf5-dev).

2. The file is not in HDF5 format.

This is a known cause of your error message. But since you said you made sure that the files are HDF5 and that you've opened them before, I'm giving this just as a reference for others who may stumble upon this:

MAT-files saved as version 7.3 use an HDF5-based format (more doc). Earlier MAT-file versions (v4 (Level 1.0), v6, and v7 up to 7.2) are not HDF5 files, but they are supported by and can be read with the scipy library:

import scipy.io
f = scipy.io.loadmat('dataset.mat')
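To decide which reader applies, you can look at the text header at the start of the file. Here is a rough sketch, assuming the file was written by MATLAB itself, which starts the header with "MATLAB 5.0 MAT-file" for the older binary format and "MATLAB 7.3 MAT-file" for the HDF5-based one. The function name mat_file_version is hypothetical, and files written by other tools may carry a different header text.

```python
def mat_file_version(path):
    """Guess the MAT-file flavour from its descriptive text header.

    Returns "7.3" (HDF5-based, try h5py), "5" (older binary format,
    try scipy.io.loadmat), or None if the header is not recognised.
    """
    with open(path, "rb") as fh:
        header = fh.read(116)  # the descriptive text field of the header
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "7.3"
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "5"
    return None
```

A result of None does not prove the file is broken (v4 files have no text header at all), but it is a hint that neither reader may work as-is.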

Otherwise you may try other methods and see whether the error persists:

PyTables is an alternative to h5py and can be found here.

import tables
# open_file is the current name; openFile was the spelling before PyTables 3.0
file = tables.open_file('test.mat')

The MATLAB Engine API for Python is another way to read MAT-files, if you have MATLAB installed. Documentation is found here: MATLAB Engine API for Python.

import matlab.engine
mat = matlab.engine.start_matlab()
f = mat.load("dataset.mat", nargout=1)