MayaFlux 0.4.0
Digital-First Multimedia Processing Framework
SoundFileReader.hpp
#pragma once

#include "FileReader.hpp"

namespace MayaFlux::Kakshya {
class DynamicSoundStream;
}

namespace MayaFlux::IO {

/**
 * @enum AudioReadOptions
 * @brief Audio-specific reading options
 */
enum class AudioReadOptions : uint32_t {
    NONE = 0,
    NORMALIZE = 1 << 0, ///< Not implemented; placeholder for future volume filter.
    CONVERT_TO_MONO = 1 << 2, ///< Not implemented; placeholder for channel mixer.
    DEINTERLEAVE = 1 << 3, ///< Output planar (per-channel) doubles instead of interleaved.
    ALL = 0xFFFFFFFF
};

inline AudioReadOptions operator|(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) | static_cast<uint32_t>(b));
}

inline AudioReadOptions operator&(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) & static_cast<uint32_t>(b));
}

/**
 * @class SoundFileReader
 * @brief FFmpeg-based audio file reader for MayaFlux
 *
 * SoundFileReader provides a high-level interface for reading and decoding audio files using FFmpeg.
 * It supports a wide range of audio formats, automatic sample format conversion to double precision,
 * resampling, metadata extraction, region/marker extraction, and streaming/seekable access.
 *
 * Key Features:
 * - Format detection and demuxing via libavformat
 * - Audio decoding via libavcodec
 * - Sample format conversion and resampling via libswresample (always outputs double)
 * - Metadata and region extraction from FFmpeg's parsed structures
 * - Seeking and timestamp handling via FFmpeg's APIs
 * - Automatic creation and population of Kakshya::SoundFileContainer for downstream processing
 * - Thread-safe access for reading and metadata queries
 *
 * Usage:
 *   SoundFileReader reader;
 *   if (reader.open("file.wav")) {
 *       auto metadata = reader.get_metadata();
 *       auto all_data = reader.read_all();
 *       auto container = reader.create_container();
 *       // ...
 *       reader.close();
 *   }
 *
 * All audio data is converted to double precision for internal processing.
 * The reader can output data in either interleaved or deinterleaved (planar) layout.
 */
class SoundFileReader : public FileReader {
public:
    /**
     * @brief Construct a new SoundFileReader object.
     * Initializes internal state and prepares for file operations.
     */
    SoundFileReader();

    /**
     * @brief Destroy the SoundFileReader object.
     * Cleans up FFmpeg resources and internal state.
     */
    ~SoundFileReader() override;

    /**
     * @brief Check if this reader can open the given file.
     * @param filepath Path to the file.
     * @return True if the file can be read, false otherwise.
     */
    bool can_read(const std::string& filepath) const override;

    /**
     * @brief Open an audio file for reading.
     * @param filepath Path to the file.
     * @param options File read options.
     * @return True if the file was opened successfully.
     */
    bool open(const std::string& filepath, FileReadOptions options = FileReadOptions::ALL) override;

    /**
     * @brief Open an audio stream from an already-constructed demux and stream context.
     *
     * Secondary open path for callers that have already probed the demuxer and
     * opened an AudioStreamContext (e.g. VideoFileReader sharing its m_demux and
     * m_audio). No avformat_open_input or AudioStreamContext::open is performed;
     * both contexts are adopted as-is.
     *
     * The filepath is required only for filesystem-level metadata (file_size,
     * modification_time) populated during build_metadata. It must point to the
     * same file already open in the demux context. Pass an empty string only if
     * FileReadOptions::EXTRACT_METADATA is not set.
     *
     * All subsequent SoundFileReader operations (metadata, regions, read_all,
     * load_into_container) behave identically to the filepath-based open().
     *
     * Both contexts must remain valid for the lifetime of this reader.
     *
     * @param demux Shared, already-open demux context.
     * @param audio Shared, already-open and valid audio stream context.
     * @param filepath Path to the source file (used for filesystem metadata only).
     * @param options File read options (metadata, regions, etc.).
     * @return True if both contexts are valid and setup succeeded.
     */
    bool open_from_demux(std::shared_ptr<FFmpegDemuxContext> demux,
        std::shared_ptr<AudioStreamContext> audio,
        const std::string& filepath,
        FileReadOptions options = FileReadOptions::ALL);

    /**
     * @brief Close the currently open file and release resources.
     */
    void close() override;

    /**
     * @brief Check if a file is currently open.
     * @return True if a file is open, false otherwise.
     */
    bool is_open() const override;

    /**
     * @brief Get metadata for the currently open file.
     * @return Optional FileMetadata structure.
     */
    std::optional<FileMetadata> get_metadata() const override;

    /**
     * @brief Get all regions (markers, loops, etc.) from the file.
     * @return Vector of FileRegion structures.
     */
    std::vector<FileRegion> get_regions() const override;

    /**
     * @brief Read the entire audio file into memory.
     * @return Vector of DataVariant, each holding audio data as std::vector<double>.
     */
    std::vector<Kakshya::DataVariant> read_all() override;

    /**
     * @brief Read a specific region from the file.
     * @param region Region to read.
     * @return Vector of DataVariant containing the region data.
     */
    std::vector<Kakshya::DataVariant> read_region(const FileRegion& region) override;

    /**
     * @brief Load an audio file into a size-bounded DynamicSoundStream.
     *
     * Opens the file, reads up to @p max_frames frames into a fully resident
     * DynamicSoundStream with auto-resize disabled, then closes the file.
     * No DataProcessor is configured on the result; that is the caller's
     * responsibility (see IOManager::configure_audio_processor).
     *
     * @param filepath Path to the audio file.
     * @param max_frames Upper bound on frame count. A value of 0 (the default)
     *                   selects 5 seconds at the target sample rate.
     * @param truncate If true, truncate files that exceed max_frames and log
     *                 an MF_WARN. If false, return nullptr on excess.
     * @return Populated DynamicSoundStream, or nullptr on failure.
     */
    [[nodiscard]] std::shared_ptr<Kakshya::DynamicSoundStream> load_bounded(
        const std::string& filepath,
        uint64_t max_frames = 0,
        bool truncate = false);

    /**
     * @brief Create a SignalSourceContainer for this file.
     * @return Shared pointer to a new SignalSourceContainer.
     */
    std::shared_ptr<Kakshya::SignalSourceContainer> create_container() override;

    /**
     * @brief Load file data into an existing SignalSourceContainer.
     * @param container Target container.
     * @return True if loading succeeded.
     */
    bool load_into_container(std::shared_ptr<Kakshya::SignalSourceContainer> container) override;

    /**
     * @brief Get the current read position in the file.
     * @return Vector of dimension indices (e.g., frame index).
     */
    std::vector<uint64_t> get_read_position() const override;

    /**
     * @brief Seek to a specific position in the file.
     * @param position Vector of dimension indices.
     * @return True if seek succeeded.
     */
    bool seek(const std::vector<uint64_t>& position) override;

    /**
     * @brief Get supported file extensions for this reader.
     * @return Vector of supported extensions (e.g., "wav", "flac").
     */
    std::vector<std::string> get_supported_extensions() const override;

    /**
     * @brief Get the C++ type of the data returned by this reader.
     * @return Type index for std::vector<double>.
     */
    std::type_index get_data_type() const override { return typeid(std::vector<double>); }

    /**
     * @brief Get the C++ type of the container returned by this reader.
     * @return Type index for Kakshya::SoundFileContainer.
     */
    std::type_index get_container_type() const override { return typeid(Kakshya::SoundFileContainer); }

    /**
     * @brief Get the last error message encountered by the reader.
     * @return Error string.
     */
    std::string get_last_error() const override;

    /**
     * @brief Check if the reader supports streaming access.
     * @return True if streaming is supported.
     */
    bool supports_streaming() const override { return true; }

    /**
     * @brief Get the preferred chunk size for streaming reads.
     * @return Preferred chunk size in frames (4096).
     */
    uint64_t get_preferred_chunk_size() const override { return 4096; }

    /**
     * @brief Get the number of dimensions in the audio data (typically 2: time, channel).
     * @return Number of dimensions.
     */
    size_t get_num_dimensions() const override;

    /**
     * @brief Get the size of each dimension (e.g., frames, channels).
     * @return Vector of dimension sizes.
     */
    std::vector<uint64_t> get_dimension_sizes() const override;

    /**
     * @brief Read a specific number of frames from the file.
     * @param num_frames Number of frames to read.
     * @param offset Frame offset from beginning.
     * @return Vector of DataVariant, each holding std::vector<double>.
     */
    std::vector<Kakshya::DataVariant> read_frames(uint64_t num_frames, uint64_t offset = 0);

    /**
     * @brief Set audio-specific read options.
     * @param options Audio read options (e.g., DEINTERLEAVE).
     */
    void set_audio_options(AudioReadOptions options) { m_audio_options = options; }

    /**
     * @brief Set the target sample rate for resampling.
     * @param sample_rate Target sample rate (0 = no resampling).
     */
    void set_target_sample_rate(uint32_t sample_rate) { m_target_sample_rate = sample_rate; }

private:
    // =========================================================================
    // Contexts (composition, Option B)
    // =========================================================================

    std::shared_ptr<FFmpegDemuxContext> m_demux; ///< Container / format state.
    std::shared_ptr<AudioStreamContext> m_audio; ///< Codec + resampler state.

    mutable std::shared_mutex m_context_mutex; ///< Guards both context pointers.

    // =========================================================================
    // Reader state
    // =========================================================================

    /**
     * @brief Path to the currently open file.
     */
    std::string m_filepath;

    /**
     * @brief File read options used for this session.
     */
    FileReadOptions m_options;

    /**
     * @brief Audio-specific read options.
     */
    AudioReadOptions m_audio_options;

    /**
     * @brief Last error message encountered.
     */
    mutable std::string m_last_error;

    /**
     * @brief Mutex for thread-safe error message access.
     */
    mutable std::mutex m_error_mutex;

    /**
     * @brief Cached file metadata.
     */
    mutable std::optional<FileMetadata> m_cached_metadata;

    /**
     * @brief Cached file regions (markers, loops, etc.).
     */
    mutable std::vector<FileRegion> m_cached_regions;

    /**
     * @brief Current frame position for reading.
     */
    std::atomic<uint64_t> m_current_frame_position { 0 };

    /**
     * @brief Target sample rate for resampling (0 = use source rate).
     */
    uint32_t m_target_sample_rate;

    /**
     * @brief Mutex for thread-safe metadata access.
     */
    mutable std::mutex m_metadata_mutex;

    // =========================================================================
    // Internal helpers
    // =========================================================================
    /**
     * @brief Decode num_frames PCM frames starting at offset.
     * @param demux Demux (container / format) context.
     * @param audio Audio stream (codec + resampler) context.
     * @param num_frames Number of frames to decode.
     * @param offset Frame offset from beginning.
     * @return Vector of DataVariant containing the decoded data.
     *
     * Caller must hold at least a shared lock on m_context_mutex.
     */
    std::vector<Kakshya::DataVariant> decode_frames(
        const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio,
        uint64_t num_frames,
        uint64_t offset);

    /**
     * @brief Seek the demuxer and flush the codec to the given frame position.
     * @param demux Demux (container / format) context.
     * @param audio Audio stream (codec + resampler) context.
     * @param frame_position Target frame position.
     * @return True if seek succeeded.
     *
     * Caller must hold an exclusive lock on m_context_mutex.
     */
    bool seek_internal(const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio,
        uint64_t frame_position);

    /**
     * @brief Build and cache FileMetadata from both contexts.
     */
    void build_metadata(const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio) const;

    /**
     * @brief Build and cache FileRegion list from both contexts.
     */
    void build_regions(const std::shared_ptr<FFmpegDemuxContext>& demux,
        const std::shared_ptr<AudioStreamContext>& audio) const;

    /**
     * @brief Set the last error message.
     * @param error Error string.
     */
    void set_error(const std::string& error) const;

    /**
     * @brief Clear the last error message.
     */
    void clear_error() const;
};

} // namespace MayaFlux::IO
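
The AudioReadOptions enum above is a bit-flag set combined with operator| and tested with operator&. The following is a minimal standalone sketch of that pattern, not MayaFlux code: the enum is copied locally so the snippet compiles without the framework, and has_flag and example_usage are hypothetical helpers that do not appear in this header.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of the enum for illustration only; the real one lives in MayaFlux::IO.
enum class AudioReadOptions : uint32_t {
    NONE = 0,
    NORMALIZE = 1 << 0,
    CONVERT_TO_MONO = 1 << 2,
    DEINTERLEAVE = 1 << 3,
    ALL = 0xFFFFFFFF
};

constexpr AudioReadOptions operator|(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) | static_cast<uint32_t>(b));
}

constexpr AudioReadOptions operator&(AudioReadOptions a, AudioReadOptions b)
{
    return static_cast<AudioReadOptions>(static_cast<uint32_t>(a) & static_cast<uint32_t>(b));
}

// Hypothetical helper: true if any bit of `flag` is also set in `value`.
constexpr bool has_flag(AudioReadOptions value, AudioReadOptions flag)
{
    return (value & flag) != AudioReadOptions::NONE;
}

// Combine two options, then query them individually.
inline void example_usage()
{
    const AudioReadOptions opts = AudioReadOptions::NORMALIZE | AudioReadOptions::DEINTERLEAVE;
    assert(has_flag(opts, AudioReadOptions::DEINTERLEAVE));
    assert(!has_flag(opts, AudioReadOptions::CONVERT_TO_MONO));
}
```

With the real reader, a combined value of this kind would be passed to set_audio_options(); whether the reader tests flags the same way internally is not visible from this header.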
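
The DEINTERLEAVE option is documented as producing planar (per-channel) doubles instead of interleaved samples. The sketch below illustrates what that layout change means for a plain std::vector<double>. It is an assumption-labelled illustration only: deinterleave and example_usage are hypothetical free functions, and the header does not show how SoundFileReader (or libswresample on its behalf) performs the conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split interleaved samples [c0f0, c1f0, c0f1, c1f1, ...]
// into one vector per channel. Trailing samples that do not form a complete
// frame are ignored. Returns an empty result when channels == 0.
inline std::vector<std::vector<double>> deinterleave(
    const std::vector<double>& interleaved, std::size_t channels)
{
    std::vector<std::vector<double>> planar;
    if (channels == 0) {
        return planar;
    }

    const std::size_t frames = interleaved.size() / channels;
    planar.assign(channels, std::vector<double>(frames));

    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < channels; ++c) {
            planar[c][f] = interleaved[f * channels + c];
        }
    }
    return planar;
}

// Three stereo frames become two channel vectors of three samples each.
inline void example_usage()
{
    const auto planar = deinterleave({ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 }, 2);
    assert(planar.size() == 2);
    assert(planar[0].size() == 3);
    assert(planar[1].size() == 3);
}
```

In the interleaved layout a frame's channels sit next to each other, which suits playback; the planar layout keeps each channel contiguous, which is usually more convenient for per-channel processing.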