Exercise: Data Pipeline Aggregator

Process an array of nullable data points across a multi-file project to compute multiple metrics simultaneously using out parameters.

Problem statement

In data science pipelines, incoming datasets often contain corrupted or missing values. The objective is to build a high-performance utility that scans a dataset exactly once and ignores missing data. It must extract the minimum, maximum, and average values for further analysis.

Task requirements

  • Create a utility class inside a dedicated file to process the data.

  • Implement a method that accepts an array of nullable doubles.

  • Calculate the minimum, maximum, and average of the valid (non-null) data points in a single iteration.

  • Output these three metrics directly to the caller without allocating a dedicated return object.

  • Handle the edge case where the array is empty or contains only null values: return false in that case, and true when processing succeeds.

Constraints

  • Use a static class and a static method, and place them in the DataScience namespace within their own file.

  • Use double?[] for the input array.

  • Use the out modifier to return the min, max, and average values alongside the method's boolean return type.

  • Use a single foreach or for loop to achieve a single-pass calculation.
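
Putting these constraints together, the method signature will look roughly like the scaffold below. The names (`DataAggregator`, `ProcessData`) come from the starter file; the single-pass body is yours to write:

```csharp
namespace DataScience;

public static class DataAggregator
{
    // Returns false when there are no valid (non-null) data points.
    public static bool ProcessData(double?[] data, out double min, out double max, out double average)
    {
        // TODO: single-pass implementation goes here.
        throw new NotImplementedException();
    }
}
```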

Good luck trying the exercise! If you’re unsure how to proceed, check the “Solution” tab above.

Get hints

  • Initialize the local tracking minimum to double.MaxValue and the maximum to double.MinValue before the loop.

  • Use the HasValue property or the is not null pattern to filter out corrupted data points during the iteration.

  • An out parameter must be definitely assigned before the method returns. Assign all three 0 at the very beginning of the method so the compiler is satisfied even if you return false early.
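
Once your method is in place, the caller consumes the out parameters like this (a usage sketch, assuming you have implemented `DataAggregator.ProcessData` as specified; the `out double min` syntax declares the variables inline at the call site):

```csharp
using DataScience;

double?[] readings = { 4.2, null, 7.9, 1.5, null };

if (DataAggregator.ProcessData(readings, out double min, out double max, out double average))
{
    Console.WriteLine($"Min: {min}, Max: {max}, Avg: {average}");
}
else
{
    Console.WriteLine("No valid data points found.");
}
```

Note that the boolean return value lets the caller distinguish "no valid data" from a dataset whose metrics legitimately happen to be 0.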

C# 14.0
namespace DataScience;
// TODO: Define a public static class named DataAggregator
// TODO: Inside, create a static method named ProcessData
// TODO: The method should return a bool and accept:
// - double?[] data
// - out double min
// - out double max
// - out double average