RFC: a convention for error handling

This RFC proposes a convention for structuring methods in SciRust
that caters to the conflicting needs of efficiency, ease of use,
and effective error handling.

For the impatient: 

```
// Efficient access without bound checks
unsafe fn get_unchecked(&self, r: usize, c: usize) -> T;
// Safe access with bound checks; returns an error if the address is invalid
fn get_checked(&self, r: usize, c: usize) -> Result<T, Error>;
// User-friendly version; panics in case of error
fn get(&self, r: usize, c: usize) -> T;

// Efficient modification without bound checks
unsafe fn set_unchecked(&mut self, r: usize, c: usize, value: T);

// Safe modification with bound checks
fn set(&mut self, r: usize, c: usize, value: T);
```
# Detailed discussion

The use of SciRust can be broadly divided into
two scenarios.
- A script style usage, where the objective is to quickly
  do some numerical experiment, get the results and analyze them.
- A library development usage, where more professional libraries
  would be built on top of fundamental building blocks provided
  by SciRust (these may be other modules shipped in SciRust itself).

While the first usage scenario is important for getting new users hooked
on the library, the second usage scenario is also important for justifying
why Rust should be used for scientific software development compared
to other scientific computing platforms.

In the context of the two usage scenarios, the design of SciRust has three conflicting goals:
- Ease of use
- Efficiency
- Well managed error handling

While ease of use is important for script style usage,
efficiency and well managed error handling are important
for serious software development on top of core components
provided by SciRust.

We will consider the example of a `get(r,c)` method
on a matrix object to discuss these conflicting goals.
Please note that `get` is just a representative method
for this discussion. The design ideas can be applied in
many different parts of SciRust once accepted.

If `get` is being called in a loop, usually the code
around it can ensure that the conditions for accessing
data within the boundary of the matrix are met correctly.
Thus, a bound check within the implementation of `get`
is just extra overhead.
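As a rough sketch of this situation, the function below validates its inputs once, outside the loop, and then reads elements without per-access bound checks. The `column_sum` function, the flat `&[f64]` storage, and the column-major layout are assumptions made for illustration; they are not taken from SciRust.

```rust
/// Sums column `c` of a `rows x cols` matrix stored column-major in `data`.
/// Hypothetical helper, used only to illustrate hoisting the bound check.
pub fn column_sum(data: &[f64], rows: usize, cols: usize, c: usize) -> f64 {
    // Validate once, before the loop.
    assert!(c < cols, "column index out of range");
    assert_eq!(data.len(), rows * cols, "storage size mismatch");

    let mut sum = 0.0;
    for r in 0..rows {
        // SAFETY: r < rows and c < cols, so
        // c * rows + r < rows * cols == data.len().
        sum += unsafe { *data.get_unchecked(c * rows + r) };
    }
    sum
}
```

Here the surrounding code, not the element access itself, carries the responsibility for staying inside the matrix.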

While skipping the bound check is good for writing efficient software,
it can lead to a number of memory-related bugs and goes
against the fundamental philosophy of Rust (safety first).
There are actually two different options for error handling:
- Returning either `Option<T>` or `Result<T, Error>`.
- Using the `panic` mechanism.

`Option<T>` or `Result<T, Error>` gives users
fine-grained control over what to do when an error occurs.
This is certainly the Rusty way of doing things. At the
same time, both of these return types make the user code
more complicated. One has to add extra calls to `.unwrap()`
even if one is sure that the function is not going to fail.
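The trade-off can be seen in a small sketch. The free function `get_checked` over a flat slice and the two callers are hypothetical names chosen for this example only. The first caller uses the returned `Option` to decide what an invalid index means; the second knows its indices are valid and still has to unwrap every result.

```rust
/// Hypothetical checked getter; returns `None` when `i` is out of range.
pub fn get_checked(data: &[f64], i: usize) -> Option<f64> {
    data.get(i).copied()
}

/// Fine-grained control: the caller chooses how to react to a bad index.
pub fn get_or_default(data: &[f64], i: usize, default: f64) -> f64 {
    match get_checked(data, i) {
        Some(v) => v,
        None => default,
    }
}

/// Extra noise: the array has exactly two elements, so both accesses
/// are known to succeed, yet each call still needs `.unwrap()`.
pub fn sum_first_two(data: &[f64; 2]) -> f64 {
    get_checked(data, 0).unwrap() + get_checked(data, 1).unwrap()
}
```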

Users of scientific computing tend to prefer an environment
where they can get more work done with less effort. This is
one reason for the success of specialized environments like
MATLAB. Open source environments like Python (NumPy, SciPy)
try to achieve something similar.

While SciRust doesn't intend to compete at the level of
simplicity provided by MATLAB/Python environments, it does
intend to make an extra effort wherever possible to address
the ease of use goal.
In this context, the return type of a getter should
be just the value type `T`. This can be achieved
safely by panicking if the access boundary
conditions are not met.

The discussion above suggests three possible ways of
implementing methods like `get`:
- An unchecked (and unsafe) version for high efficiency code
  where the calling code is responsible for ensuring that
  the necessary requirements for correct execution of the 
  method are being met.
- A safe version which returns either `Option<T>` or
  `Result<T, Error>` which can be used for professional
  software development where the calling code has full control
  over error handling.
- Another safe version which panics in case of error but provides
  an API which is simpler to use for writing short scientific
  computing scripts.
# Proposed convention

We propose that a method for which these variations
need to be supported should follow the convention defined below:
- A `method_unchecked` version should provide the basic implementation
  of the method. It should assume that the necessary conditions
  for successful execution of the method are already
  ensured by the calling code. The unchecked version of the method
  MUST be marked `unsafe`. This ensures that the calling code
  knows that it is responsible for ensuring the right conditions
  for calling the unchecked method.
- A `method_checked` version should be implemented on top of
  the `method_unchecked` method. The checked version should
  check all the requirements for calling the method safely.
  The return type should be either `Option<T>` or
  `Result<T, Error>`. If the required conditions for
  calling the method are not met, `None` or an `Err`
  should be returned. Once the required conditions are met,
  `method_unchecked` should be called to get the result,
  which is then wrapped inside `Option` or `Result`.
- A `method` version should be built on top of the `method_checked` version.
  It should simply attempt to unwrap
  the value returned by `method_checked` and return it as `T`.
  If `method_checked` returns an error or `None`, this version
  should panic.
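The three layers described above can be sketched for `get` as follows. This is a minimal illustration under stated assumptions, not SciRust's actual implementation: the `Matrix` struct, its column-major layout, the `from_vec` constructor, and the `Error::IndexOutOfBounds` variant are all hypothetical.

```rust
/// Hypothetical error type; the RFC only refers to it as `Error`.
#[derive(Debug, PartialEq)]
pub enum Error {
    IndexOutOfBounds { r: usize, c: usize },
}

/// Hypothetical column-major matrix, used only for illustration.
pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T: Copy> Matrix<T> {
    /// Builds a matrix from column-major data; panics on a size mismatch.
    pub fn from_vec(rows: usize, cols: usize, data: Vec<T>) -> Matrix<T> {
        assert_eq!(rows * cols, data.len(), "storage size mismatch");
        Matrix { rows, cols, data }
    }

    /// Basic implementation without bound checks.
    /// The caller must guarantee `r < rows` and `c < cols`.
    pub unsafe fn get_unchecked(&self, r: usize, c: usize) -> T {
        unsafe { *self.data.get_unchecked(c * self.rows + r) }
    }

    /// Checks the requirements, then delegates to `get_unchecked`.
    pub fn get_checked(&self, r: usize, c: usize) -> Result<T, Error> {
        if r >= self.rows || c >= self.cols {
            return Err(Error::IndexOutOfBounds { r, c });
        }
        // SAFETY: both indices were validated just above.
        Ok(unsafe { self.get_unchecked(r, c) })
    }

    /// Script-friendly version: unwraps `get_checked`, panicking on error.
    pub fn get(&self, r: usize, c: usize) -> T {
        self.get_checked(r, c).unwrap()
    }
}
```

Each layer adds exactly one concern on top of the previous one: `get_checked` adds validation, and `get` adds the panic, so the indexing logic lives only in `get_unchecked`.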

The first two versions are suitable for professional development,
where most of the time we need a safe API but occasionally
need an unsafe API for efficient implementation.
The third version is suitable for the script style usage scenario.

The convention has been illustrated in the three versions of
`get` at the beginning of this document.
# API bloat

This convention is expected to lead to some API bloat.
However, if the convention is followed consistently across the library,
then it should be easy to follow (both from the perspective
of users of the library and from the perspective of developers
of the library).

