TL; DR:
How refpts
are handled is inconsistent at best and erroneous at worst; furthermore, the docstrings only complicate things by contradicting each other. Please can someone outline the intended use syntax for refpts
?
I have discovered an issue with the consistency of refpts
, although refpoint related this is different to the issue discussed on #75.
The issue is primarily due to there being no max length check in all of our refpts
handling, and the inconsistency/ambiguity of the docstrings of functions that use refpts
.
On the first point:
Since, in uint32_refpts
, we currently raise a ValueError if numerical refpts
contain duplicates or are not in ascending order, therefore it would be intuitive to have a maximum length check.
Summary of behaviour:
============================================================================================
| refpts = | returns / raises |
============================================================================================
| numpy.array([0,...,2*len(ring)], | Error raised later in function due to indexing |
| dtype=uint32) | / calculation issues, or data of length |
| | 2*len(ring) is sucessfully returned. |
--------------------------------------------------------------------------------------------
| numpy.ones(len(ring), dtype=bool) | data at entrance to every element in the ring |
--------------------------------------------------------------------------------------------
| numpy.ones(len(ring)+1, dtype=bool) | above + data at the non-existant len(ring)+1th |
| | element |
--------------------------------------------------------------------------------------------
| numpy.ones(2*len(ring), dtype=bool) | ValueError: refpts must be ascending |
--------------------------------------------------------------------------------------------
| [0, 1, ..., len(ring)-2, len(ring)-1] | data at entrance to every element in the ring |
--------------------------------------------------------------------------------------------
| [0, 1, ..., len(ring)-1, len(ring)] | above + data at the non-existant len(ring)+1th |
| | element |
--------------------------------------------------------------------------------------------
| [0, 1, ..., 2*len(ring)-1, 2*len(ring)] | ValueError: refpts must be ascending |
--------------------------------------------------------------------------------------------
| [2*len(ring)] | data at entrance to ring i.e. eqivelent to |
| | putting refpts=0 |
============================================================================================
The first isssue is that any array with dtype=numpy.uint32
remains unparsed and so either causes issues in calculations or indexing later on in the function, or even worse it returns data at elements beyond the end of the lattice. The behaviour differs by function but is consistent inside each function.
Secondly because of the line refs = numpy.array([i if (i == n_elements) else i % n_elements for i in numpy.ravel(refs)], dtype=numpy.uint32)
in uint32_refpts
, which is used to support negative indexing, any refpoint > n_elements is converted into a smaller refpoint, usually a duplicate.
In relation to the second:
The few functions that have an explanation of the function of refpts
conflict with each other in those explanations.
get_twiss
, linopt
and ohmi_envelope
give the following description of refpts
:
refpts elements at which data is returned. It can be
1) an integer (0 indicating the first element)
2) a list of integers
3) a numpy array of booleans as long as ring where
selected elements are true
Defaults to None
...
All values are given at the entrance of each element specified in refpts
lattice_pass
further confuses the issue with:
Note:
* lattice_pass(lattice, r_in, refpts=len(line)) is the same as
lattice_pass(lattice, r_in) since the reference point len(line) is the
exit of the last element
* lattice_pass(lattice, r_in, refpts=0) is a copy of r_in since the
reference point 0 is the entrance of the first element
Finally bool_refpts
introduces another contradiction:
Return a boolean numpy array of length n_elements + 1 where True elements
are selected. This is used for indexing a lattice using True or False
values.
N.B. find_orbit4
, find_sync_orbit
, find_orbit6
, find_m44
, find_m66
, and patpass
all make use of refpts
without any explanation in their docstrings.
My conclusions:
- All sources and Pythonic common sense both correctly say that
refpts
should be indexed from 0.
- As for maximum length, although the docstrings contradict each other, I think n_elements+1 makes sense; as despite
refpts=n_elements+1
being, logically speaking, the same as refpts=0
for some of the physics data it is useful to have because it indicates the data at the end of the last element/ring which may well be different to the data at the start of the first element.
- Indexing beyond the maximum length, of any kind, should not be allowed and should raise a descriptive error message.
- A single unified explanation of the function and syntax of
refpts
should either be in all relevant docstrings or they should reference or inherit one description.
- I would also like to rewrite
uint32_refpts
and put the refpts
checking logic into a separate parsing function, that way it could be used more generally and uint32_refpts
would contain purely the conversion to uint32; or perhaps a general refpts conversion function that takes an output_dtype
argument might be better?
Please can someone confirm whether the conclusions that I've outlined are correct; any suggestions or insights about the refpts
handling redesign are also welcome.
I'd also be interested in the equivalent behaviour in Matlab, if anyone knows off-hand, I may look into this fully later on.