MayaFlux 0.3.0
Digital-First Multimedia Processing Framework
SoundFileReader.hpp
#pragma once

#include "FileReader.hpp"

namespace MayaFlux::IO {

/**
 * @enum AudioReadOptions
 * @brief Audio-specific reading options
 */
enum class AudioReadOptions : uint32_t {
    NONE = 0,
    NORMALIZE = 1 << 0, ///< Not implemented; placeholder for future volume filter.
    CONVERT_TO_MONO = 1 << 2, ///< Not implemented; placeholder for channel mixer.
    DEINTERLEAVE = 1 << 3, ///< Output planar (per-channel) doubles instead of interleaved.
    ALL = 0xFFFFFFFF
};

inline AudioReadOptions operator|(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) | static_cast<uint32_t>(b));
}

inline AudioReadOptions operator&(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) & static_cast<uint32_t>(b));
}

/**
 * @class SoundFileReader
 * @brief FFmpeg-based audio file reader for MayaFlux
 *
 * SoundFileReader provides a high-level interface for reading and decoding audio files using FFmpeg.
 * It supports a wide range of audio formats, automatic sample format conversion to double precision,
 * resampling, metadata extraction, region/marker extraction, and streaming/seekable access.
 *
 * Key Features:
 * - Format detection and demuxing via libavformat
 * - Audio decoding via libavcodec
 * - Sample format conversion and resampling via libswresample (always outputs double)
 * - Metadata and region extraction from FFmpeg's parsed structures
 * - Seeking and timestamp handling via FFmpeg's APIs
 * - Automatic creation and population of Kakshya::SoundFileContainer for downstream processing
 * - Thread-safe access for reading and metadata queries
 *
 * Usage:
 *   SoundFileReader reader;
 *   if (reader.open("file.wav")) {
 *       auto metadata = reader.get_metadata();
 *       auto all_data = reader.read_all();
 *       auto container = reader.create_container();
 *       // ...
 *       reader.close();
 *   }
 *
 * All audio data is converted to double precision for internal processing.
 * The reader can output data in either interleaved or deinterleaved (planar) layout.
 */
class SoundFileReader : public FileReader {
public:
    /**
     * @brief Construct a new SoundFileReader object.
     * Initializes internal state and prepares for file operations.
     */
    SoundFileReader();

    /**
     * @brief Destroy the SoundFileReader object.
     * Cleans up FFmpeg resources and internal state.
     */
    ~SoundFileReader() override;

    /**
     * @brief Check if this reader can open the given file.
     * @param filepath Path to the file.
     * @return True if the file can be read, false otherwise.
     */
    bool can_read(const std::string& filepath) const override;

    /**
     * @brief Open an audio file for reading.
     * @param filepath Path to the file.
     * @param options File read options.
     * @return True if the file was opened successfully.
     */
    bool open(const std::string& filepath, FileReadOptions options = FileReadOptions::ALL) override;

    /**
     * @brief Open an audio stream from an already-constructed demux and stream context.
     *
     * Secondary open path for callers that have already probed the demuxer and
     * opened an AudioStreamContext (e.g. VideoFileReader sharing its m_demux and
     * m_audio). No avformat_open_input or AudioStreamContext::open is performed;
     * both contexts are adopted as-is.
     *
     * The filepath is required only for filesystem-level metadata (file_size,
     * modification_time) populated during build_metadata. It must point to the
     * same file already open in the demux context. Pass an empty string only if
     * FileReadOptions::EXTRACT_METADATA is not set.
     *
     * All subsequent SoundFileReader operations (metadata, regions, read_all,
     * load_into_container) behave identically to the filepath-based open().
     *
     * Both contexts must remain valid for the lifetime of this reader.
     *
     * @param demux Shared, already-open demux context.
     * @param audio Shared, already-open and valid audio stream context.
     * @param filepath Path to the source file (used for filesystem metadata only).
     * @param options File read options (metadata, regions, etc.).
     * @return True if both contexts are valid and setup succeeded.
     */
    bool open_from_demux(std::shared_ptr<FFmpegDemuxContext> demux,
        std::shared_ptr<AudioStreamContext> audio,
        const std::string& filepath,
        FileReadOptions options = FileReadOptions::ALL);

    /**
     * @brief Close the currently open file and release resources.
     */
    void close() override;

    /**
     * @brief Check if a file is currently open.
     * @return True if a file is open, false otherwise.
     */
    bool is_open() const override;

    /**
     * @brief Get metadata for the currently open file.
     * @return Optional FileMetadata structure.
     */
    std::optional<FileMetadata> get_metadata() const override;

    /**
     * @brief Get all regions (markers, loops, etc.) from the file.
     * @return Vector of FileRegion structures.
     */
    std::vector<FileRegion> get_regions() const override;

    /**
     * @brief Read the entire audio file into memory.
     * @return Vector of DataVariant holding the audio data as std::vector<double>.
     */
    std::vector<Kakshya::DataVariant> read_all() override;

    /**
     * @brief Read a specific region from the file.
     * @param region Region to read.
     * @return Vector of DataVariant holding the region data.
     */
    std::vector<Kakshya::DataVariant> read_region(const FileRegion& region) override;

    /**
     * @brief Create a SignalSourceContainer for this file.
     * @return Shared pointer to a new SignalSourceContainer.
     */
    std::shared_ptr<Kakshya::SignalSourceContainer> create_container() override;

    /**
     * @brief Load file data into an existing SignalSourceContainer.
     * @param container Target container.
     * @return True if loading succeeded.
     */
    bool load_into_container(std::shared_ptr<Kakshya::SignalSourceContainer> container) override;

    /**
     * @brief Get the current read position in the file.
     * @return Vector of dimension indices (e.g., frame index).
     */
    std::vector<uint64_t> get_read_position() const override;

    /**
     * @brief Seek to a specific position in the file.
     * @param position Vector of dimension indices.
     * @return True if seek succeeded.
     */
    bool seek(const std::vector<uint64_t>& position) override;

    /**
     * @brief Get supported file extensions for this reader.
     * @return Vector of supported extensions (e.g., "wav", "flac").
     */
    std::vector<std::string> get_supported_extensions() const override;

    /**
     * @brief Get the C++ type of the data returned by this reader.
     * @return Type index for std::vector<double>.
     */
    std::type_index get_data_type() const override { return typeid(std::vector<double>); }

    /**
     * @brief Get the C++ type of the container returned by this reader.
     * @return Type index for Kakshya::SoundFileContainer.
     */
    std::type_index get_container_type() const override { return typeid(Kakshya::SoundFileContainer); }

    /**
     * @brief Get the last error message encountered by the reader.
     * @return Error string.
     */
    std::string get_last_error() const override;

    /**
     * @brief Check if the reader supports streaming access.
     * @return True if streaming is supported.
     */
    bool supports_streaming() const override { return true; }

    /**
     * @brief Get the preferred chunk size for streaming reads.
     * @return Preferred chunk size in frames (this reader returns 4096).
     */
    uint64_t get_preferred_chunk_size() const override { return 4096; }

    /**
     * @brief Get the number of dimensions in the audio data (typically 2: time, channel).
     * @return Number of dimensions.
     */
    size_t get_num_dimensions() const override;

    /**
     * @brief Get the size of each dimension (e.g., frames, channels).
     * @return Vector of dimension sizes.
     */
    std::vector<uint64_t> get_dimension_sizes() const override;

    /**
     * @brief Read a specific number of frames from the file.
     * @param num_frames Number of frames to read.
     * @param offset Frame offset from beginning.
     * @return Vector of DataVariant holding the frames as std::vector<double>.
     */
    std::vector<Kakshya::DataVariant> read_frames(uint64_t num_frames, uint64_t offset = 0);

    /**
     * @brief Set audio-specific read options.
     * @param options Audio read options (e.g., DEINTERLEAVE).
     */
    void set_audio_options(AudioReadOptions options);

    /**
     * @brief Set the target sample rate for resampling.
     * @param sample_rate Target sample rate (0 = no resampling).
     */
    void set_target_sample_rate(uint32_t sample_rate) { m_target_sample_rate = sample_rate; }

private:
    // =========================================================================
    // Contexts (composition, Option B)
    // =========================================================================

    std::shared_ptr<FFmpegDemuxContext> m_demux; ///< Container / format state.
    std::shared_ptr<AudioStreamContext> m_audio; ///< Codec + resampler state.

    mutable std::shared_mutex m_context_mutex; ///< Guards both context pointers.

    // =========================================================================
    // Reader state
    // =========================================================================

    /**
     * @brief Path to the currently open file.
     */
    std::string m_filepath;

    /**
     * @brief File read options used for this session.
     */
    FileReadOptions m_options;

    /**
     * @brief Audio-specific read options.
     */
    AudioReadOptions m_audio_options;

    /**
     * @brief Last error message encountered.
     */
    mutable std::string m_last_error;

    /**
     * @brief Mutex for thread-safe error message access.
     */
    mutable std::mutex m_error_mutex;

    /**
     * @brief Cached file metadata.
     */
    mutable std::optional<FileMetadata> m_cached_metadata;

    /**
     * @brief Cached file regions (markers, loops, etc.).
     */
    mutable std::vector<FileRegion> m_cached_regions;

    /**
     * @brief Current frame position for reading.
     */
    std::atomic<uint64_t> m_current_frame_position { 0 };

    /**
     * @brief Target sample rate for resampling (0 = use source rate).
     */
    uint32_t m_target_sample_rate;

    /**
     * @brief Mutex for thread-safe metadata access.
     */
    mutable std::mutex m_metadata_mutex;

    // =========================================================================
    // Internal helpers
    // =========================================================================
    /**
     * @brief Decode num_frames PCM frames starting at offset.
     * @param demux Demux (container / format) context.
     * @param audio Audio stream (codec + resampler) context.
     * @param num_frames Number of frames to decode.
     * @param offset Frame offset from beginning.
     * @return Vector of DataVariant containing decoded data.
     *
     * Caller must hold at least a shared lock on m_context_mutex.
     */
    std::vector<Kakshya::DataVariant> decode_frames(
        const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio,
        uint64_t num_frames,
        uint64_t offset);

    /**
     * @brief Seek the demuxer and flush the codec to the given frame position.
     * @param demux Demux (container / format) context.
     * @param audio Audio stream (codec + resampler) context.
     * @param frame_position Target frame position.
     * @return True if seek succeeded.
     *
     * Caller must hold an exclusive lock on m_context_mutex.
     */
    bool seek_internal(const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio,
        uint64_t frame_position);

    /**
     * @brief Build and cache FileMetadata from both contexts.
     */
    void build_metadata(const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio) const;

    /**
     * @brief Build and cache FileRegion list from both contexts.
     */
    void build_regions(const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio) const;

    /**
     * @brief Set the last error message.
     * @param error Error string.
     */
    void set_error(const std::string& error) const;

    /**
     * @brief Clear the last error message.
     */
    void clear_error() const;
};

} // namespace MayaFlux::IO
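The bitwise operators defined for AudioReadOptions in the listing above exist so that options can be combined and tested as a bitmask. The following is a minimal, self-contained sketch of that pattern. The enum is a local copy made for illustration, and `has_option` and `demo_audio_options` are hypothetical helpers that do not appear in the MayaFlux header.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of the enum from the listing, for illustration only.
enum class AudioReadOptions : uint32_t {
    NONE = 0,
    NORMALIZE = 1 << 0,
    CONVERT_TO_MONO = 1 << 2,
    DEINTERLEAVE = 1 << 3,
    ALL = 0xFFFFFFFF
};

constexpr AudioReadOptions operator|(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) | static_cast<uint32_t>(b));
}

constexpr AudioReadOptions operator&(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) & static_cast<uint32_t>(b));
}

// Hypothetical helper (not in MayaFlux): true if every bit of `flag` is set in `value`.
// Note that testing for NONE is always true, because NONE has no bits set.
constexpr bool has_option(AudioReadOptions value, AudioReadOptions flag)
{
    return (value & flag) == flag;
}

// Hypothetical demo: combine two options, then test individual flags.
void demo_audio_options()
{
    const AudioReadOptions opts = AudioReadOptions::DEINTERLEAVE | AudioReadOptions::NORMALIZE;
    assert(has_option(opts, AudioReadOptions::DEINTERLEAVE));
    assert(has_option(opts, AudioReadOptions::NORMALIZE));
    assert(!has_option(opts, AudioReadOptions::CONVERT_TO_MONO));
}
```

With the real header, a combined value of this kind would presumably be passed to `set_audio_options`. According to the enum comments, only DEINTERLEAVE currently has an effect.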
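The DEINTERLEAVE option is documented as producing planar (per-channel) doubles instead of interleaved samples. The header does not show how SoundFileReader performs the conversion, so the sketch below only illustrates the two layouts. `deinterleave` is a hypothetical free function, not part of MayaFlux, and it assumes the interleaved buffer stores one sample per channel for each frame in turn.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MayaFlux): split interleaved samples
// [ch0 f0, ch1 f0, ch0 f1, ch1 f1, ...] into one vector per channel.
std::vector<std::vector<double>> deinterleave(const std::vector<double>& interleaved,
    std::size_t num_channels)
{
    // Preconditions for this sketch: at least one channel, whole frames only.
    assert(num_channels > 0);
    assert(interleaved.size() % num_channels == 0);

    const std::size_t num_frames = interleaved.size() / num_channels;
    std::vector<std::vector<double>> planar(num_channels, std::vector<double>(num_frames));

    for (std::size_t frame = 0; frame < num_frames; ++frame) {
        for (std::size_t ch = 0; ch < num_channels; ++ch) {
            // Sample for (frame, ch) sits at frame * num_channels + ch.
            planar[ch][frame] = interleaved[frame * num_channels + ch];
        }
    }
    return planar;
}
```

For a stereo buffer holding three frames, the result is two vectors of three samples each, the first containing the samples at even indices and the second those at odd indices.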