I want to train an autoencoder on MP3 songs. Given the size of the dataset, only part of it should be held in memory at any given time.
I tried using tfio together with tf.data.Dataset, but fitting the model raises an error:
ValueError: Cannot iterate over a shape with unknown rank.
The code is as follows:
import tensorflow as tf
import tensorflow_io as tfio
from tensorflow import keras

segment_length = 1024
filenames = tf.data.Dataset.list_files('data/*')

def decode_mp3(mp3_path):
    mp3_path = mp3_path.numpy().decode("utf-8")
    audio = tfio.audio.AudioIOTensor(mp3_path)
    audio_tensor = tf.cast(audio[:], tf.float32)
    # Drop trailing samples that do not fill a whole segment. Slicing with
    # [:-0] would return an empty tensor, so only trim when overflow > 0.
    overflow = len(audio_tensor) % segment_length
    if overflow:
        audio_tensor = audio_tensor[:-overflow]
    audio_tensor = audio_tensor[:, 0]  # keep the first channel only
    return audio_tensor

song_dataset = filenames.map(lambda path:
    tf.py_function(func=decode_mp3, inp=[path], Tout=tf.float32))
segment_dataset = song_dataset.flat_map(lambda song:
    tf.data.Dataset.from_tensor_slices(song)).batch(segment_length)
dataset = segment_dataset.map(lambda x: (x, x))  # add labels (identical to inputs here)
The model looks like this:
encoder = keras.models.Sequential([
    keras.layers.Input((segment_length, 1)),
    keras.layers.Conv1D(128, 3, strides=2, padding="same"),
    ...
])
As I said, calling fit throws the error above, even though the element shapes are exactly what I expect:
for x, y in dataset.take(1):
    print(x.shape, y.shape)
> (1024, 1) (1024, 1)
Any help on this would be appreciated. I might be misunderstanding something with input shapes and datasets.
Answer:
So I finally found part of the answer. The Input layer seems to be meant for models built with the functional API (?), so I removed it. The model now looks like this:
encoder = keras.models.Sequential([
    keras.layers.Conv1D(128, 3, strides=2, padding="same", input_shape=(segment_length, 1)),
    ...
])
Here the Input layer is replaced with an input_shape argument on the first Conv1D layer. I also batched the dataset with
ds = dataset.batch(2)
which turned out to be important too. Any further clarification would still be appreciated. Nonetheless, I hope this helps people with the same problem.
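As a possible further clarification: tensors returned by tf.py_function carry no static shape information, so inside the tf.data graph their rank is unknown even though iterating eagerly prints concrete shapes. That is a likely source of the "unknown rank" error. Keras also expects a leading batch dimension, which would explain why batch(2) mattered. Below is a minimal sketch of one way to restore the shape with set_shape. The fake_decode function and the file names are hypothetical stand-ins for decode_mp3 and real files, and reshaping into segments is an alternative I swapped in for the from_tensor_slices(...).batch(segment_length) step, not the original approach.

```python
import tensorflow as tf

segment_length = 1024

def fake_decode(path):
    # Hypothetical stand-in for decode_mp3: a mono signal that is
    # already a whole number of segments long (4 segments here).
    return tf.zeros([4 * segment_length], dtype=tf.float32)

def load_song(path):
    song = tf.py_function(func=fake_decode, inp=[path], Tout=tf.float32)
    # py_function outputs have unknown static shape; declare the rank
    # (1-D, unknown length) so downstream ops can infer shapes.
    song.set_shape([None])
    return song

def to_segments(song):
    # Split one song into segments of shape (segment_length, 1).
    segments = tf.reshape(song, [-1, segment_length, 1])
    return tf.data.Dataset.from_tensor_slices(segments)

# Placeholder file names; fake_decode never opens them.
paths = tf.data.Dataset.from_tensor_slices(["a.mp3", "b.mp3"])
dataset = (
    paths.map(load_song)
    .flat_map(to_segments)
    .map(lambda x: (x, x))  # autoencoder: target equals input
    .batch(2)
)

# The static spec now has a known rank: (batch, segment_length, 1).
print(dataset.element_spec)
```

With a fully defined element_spec like this, the model should be able to infer its input shape from the dataset, whether the first layer is an Input layer or a Conv1D with input_shape.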
