jhauberg / dledger
Track and forecast dividend income
License: MIT License
A transaction is a transaction, period. There should be no need for a type hierarchy here (e.g. let's get rid of FutureTransaction).
The only thing we're really interested in is whether a date or an amount is an estimation; i.e. for a generated transaction, both date and amount would be estimates, but anything you've entered manually is not.
For example, take this preliminary record:
2020/03/01 ABC (10)
@ $ 0.1
By putting this record in your journal, you've entered some very concrete details about this transaction; namely the date, the dividend and the position. These are absolutes and not estimations. The amount, however, is missing and will be an estimate once it comes out the other end.
It would be useful to be able to make this distinction for the properties of a transaction, especially for reporting (see #11).
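As a minimal sketch of the idea, a single flat record could carry a flag per property instead of relying on a subclass. The class layout, field names and the with_estimated_amount helper below are hypothetical and not taken from dledger's actual code.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    """Hypothetical flat record; field names are illustrative only."""

    entry_date: date
    ticker: str
    position: int
    amount: Optional[float] = None
    dividend: Optional[float] = None
    # Flags marking which properties were derived rather than entered by hand.
    is_date_estimated: bool = False
    is_amount_estimated: bool = False


def with_estimated_amount(record: Transaction) -> Transaction:
    """Fill in a missing amount from position and dividend, flagged as an estimate."""
    if record.amount is not None or record.dividend is None:
        return record  # nothing to estimate, or nothing to estimate from
    return replace(
        record,
        amount=record.position * record.dividend,
        is_amount_estimated=True,
    )
```

For the preliminary record above, the date stays concrete while only the amount is flagged as an estimate, which is the distinction a report would need.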
Two journals can include each other, technically causing an infinite loop. This should be prevented.
# a.journal
include b.journal
# b.journal
include a.journal
$ dledger report a.journal
Currently, you'll probably be hit by an OSError instead of actually running infinitely, but if we could catch and prevent this from happening, that would be better.
OSError: [Errno 24] Too many open files
Note that we can't rely on checking an already parsed Transaction's entry_attr.location, because (like in the above example), journals could have zero records.
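One way to guard against this, sketched under assumptions: track the resolved path of every journal already visited, independently of any parsed records. The function name collect_journal_paths is hypothetical, and the sketch assumes that an include directive is a line starting with "include " and that included paths are relative to the including journal.

```python
import os
from typing import List, Optional, Set


def collect_journal_paths(path: str, seen: Optional[Set[str]] = None) -> List[str]:
    """Return journal paths reachable from `path`, visiting each file only once."""
    seen = set() if seen is None else seen
    resolved = os.path.realpath(path)
    if resolved in seen:
        return []  # already visited: a cycle or a repeated include
    seen.add(resolved)
    paths = [resolved]
    # Read everything and close the file before recursing, so that deeply
    # nested includes do not keep file handles open.
    with open(resolved, encoding="utf-8") as file:
        lines = file.read().splitlines()
    for line in lines:
        if line.startswith("include "):
            included = line[len("include "):].strip()
            # Assumption: include paths are relative to the including journal.
            included = os.path.join(os.path.dirname(resolved), included)
            paths.extend(collect_journal_paths(included, seen))
    return paths
```

Because the check is made on file paths rather than on parsed records, it also works for journals that contain nothing but include directives.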
If I have entered an amount, e.g. 2,20 kr, then I expect this to also be the format displayed in reports and printouts. However, currently this will instead be truncated and displayed as 2,2 kr.
It's still the correct amount, of course, but dledger should respect the entered format.
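A rough sketch of what respecting the entered format could involve: remember how many decimal places were typed, then reuse that count when formatting. Both helper names are hypothetical, and how dledger actually parses and stores amounts is not shown here.

```python
def decimal_places(text: str, decimal_separator: str = ",") -> int:
    """Count the digits written after the decimal separator in an entered number."""
    _, separator, fraction = text.partition(decimal_separator)
    return len(fraction) if separator else 0


def format_amount(value: float, places: int, decimal_separator: str = ",") -> str:
    """Format `value` with a fixed number of decimal places."""
    return f"{value:.{places}f}".replace(".", decimal_separator)
```

With the entered text "2,20", decimal_places reports two places, and format_amount(2.2, 2) renders the value as "2,20" again.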
This is a problem with the reporting function where it deduces the kind of transaction based on its properties:
Lines 169 to 180 in 9e73649
We need a better solution for determining how a transaction row should be displayed.
The order of events in main is important, and any minor change could wind up producing different results; specifically for the report command (e.g. print, convert and stats are less affected as they are so simple).
Because there's currently no test to verify that it works as expected, it's a daunting task to work on any step related to this bit because you can't feel confident about it.
I think the approach is two-fold: 1) main should be refactored to do less; e.g. only routing of input, and 2) a test should exist for each combination of input (e.g. expected results, not print output).
I think implementing 1) would automatically open up more testing opportunities, mitigating the need for extensive testing of 2).
Manually entered transactions (set in the future) are displayed exactly like generated/forecasted transactions, where the arrow (<) would typically indicate "by/before this date".
However, for non-generated transactions, that is not right. You entered the transaction and gave it a date, specifically, so that date is the date and not just an estimation. This should be clearly visible, and I think we do that by just getting rid of the little arrow (or exchanging it with !).
Here's an example (assuming date is 2020/02/13):
~ $ 0.95 < 2020/02/24 JNJ
I think this would be better:
~ $ 0.95 2020/02/24 JNJ
This seems simple, but the problem is that we currently have no way of distinguishing between a concrete, manually entered transaction and a generated transaction (e.g. Transaction vs. FutureTransaction). That needs to be solved first, I'd say.
Running the same report twice can produce a different ordering of the same forecasted transactions. The order should always be consistent and identical for the same input.
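The cause of the varying order is not stated here. Assuming it comes from ties between records that share a date, one possible remedy is to sort on a key that leaves no ties. The Record type and stable_order function below are hypothetical stand-ins.

```python
from datetime import date
from typing import List, Tuple

# Hypothetical stand-in for a forecasted transaction: (date, ticker).
Record = Tuple[date, str]


def stable_order(records: List[Record]) -> List[Record]:
    """Sort by date, breaking ties by ticker so the result never depends on input order."""
    return sorted(records, key=lambda record: (record[0], record[1]))
```

Any additional property that distinguishes two records would need to be part of the key as well.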
See psf/black
This could be a useful testing component for e.g. performance benchmarking, as it could generate much larger journals than you'd want to do by hand.
It could also generate journals that fit certain scenarios, if today's date matters to the outcome (current sample journals use static dates, and will eventually produce different outcomes as time goes by; for example, simple.journal will no longer produce forecasts by 2021).
It would be nice if it was possible for --by-ticker to "autocomplete" so you don't have to enter the entire ticker if the part you do enter will uniquely match some ticker.
For example, a journal containing records for the following tickers:
ABC
ABCD
EFG
Then, --by-ticker=E should result in EFG, as that is the only uniquely identifiable match.
If it is not uniquely identifiable and ambiguous, then it should still result in nothing; e.g. --by-ticker=AB should result in no results.
input | result |
---|---|
A | None |
AB | None |
ABC | ABC |
ABCD | ABCD |
E | EFG |
EF | EFG |
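The table above can be reproduced with a small lookup, sketched here. The function name match_ticker is hypothetical; the rule that an exact match wins over longer tickers sharing the same prefix is inferred from the ABC row.

```python
from typing import Iterable, Optional


def match_ticker(text: str, tickers: Iterable[str]) -> Optional[str]:
    """Return the ticker uniquely identified by `text`, or None if ambiguous or unknown."""
    candidates = set(tickers)
    if text in candidates:
        return text  # an exact match always wins, even if longer tickers share the prefix
    matches = [ticker for ticker in candidates if ticker.startswith(text)]
    return matches[0] if len(matches) == 1 else None
```

With the tickers ABC, ABCD and EFG, match_ticker("E", ...) yields EFG while match_ticker("AB", ...) yields None.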
This could be useful to keep active journals from growing too large, and allows "archiving" old transactions but still including them for posterity (i.e. forecasts and the future are not so dependent on old transactions, but history is always interesting to keep around).
It is already possible to "include" other journals; e.g. $ dledger report a.journal b.journal c.journal; however, this does not work well when making use of the DLEDGER_FILE environment variable (you can only have one journal).
Practically, an include directive should result in the same as running the above command; i.e. it's just a matter of reading all records, putting them together and sorting out their order.
I noticed an inconsistency in the projection of this monthly dividend payer when reporting --by-payout-date. Presumably this also occurs when reporting --by-ex-date.
The problem is with a projection that lands in the same month as an existing transaction.
Here's the base listing, using default entry date, where everything is as expected:
$ dledger report ~/ledger.journal --by-ticker=O
REDACTED 2019/04/30 O $ 0,2260 (REDACTED)
REDACTED 2019/05/31 O $ 0,2260 (REDACTED)
REDACTED 2019/06/28 O $ 0,2265 (REDACTED)
REDACTED 2019/07/31 O $ 0,2265 (REDACTED)
REDACTED 2019/08/30 O $ 0,2265 (REDACTED)
REDACTED 2019/09/30 O $ 0,2270 (REDACTED)
REDACTED 2019/10/31 O $ 0,2270 (REDACTED)
REDACTED 2019/11/28 O $ 0,2270 (REDACTED)
REDACTED 2019/12/31 O $ 0,2275 (REDACTED)
REDACTED 2020/01/31 O $ 0,2325 (REDACTED)
~ REDACTED ! 2020/02/29 O $ 0,2325 (REDACTED) <--
~ REDACTED < 2020/03/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/04/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/05/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/06/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/07/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/08/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/09/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/10/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/11/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/12/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2021/01/31 O $ 0,2325 (REDACTED)
Now, here's a listing --by-payout-date:
$ dledger report ~/ledger.journal --by-ticker=O --by-payout-date
REDACTED 2019/04/30 O $ 0,2260 (REDACTED)
REDACTED 2019/05/31 O $ 0,2260 (REDACTED)
REDACTED 2019/06/28 O $ 0,2265 (REDACTED)
REDACTED 2019/07/31 O $ 0,2265 (REDACTED)
REDACTED 2019/08/30 O $ 0,2265 (REDACTED)
REDACTED 2019/09/30 O $ 0,2270 (REDACTED)
REDACTED 2019/10/31 O $ 0,2270 (REDACTED)
REDACTED 2019/11/28 O $ 0,2270 (REDACTED)
REDACTED 2020/01/15 O $ 0,2275 (REDACTED)
REDACTED 2020/02/14 O $ 0,2325 (REDACTED)
~ REDACTED ! 2020/02/29 O $ 0,2325 (REDACTED) <--
~ REDACTED < 2020/03/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/04/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/05/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/06/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/07/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/08/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/09/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/10/31 O $ 0,2325 (REDACTED)
~ REDACTED < 2020/11/30 O $ 0,2325 (REDACTED)
~ REDACTED < 2021/01/15 O $ 0,2325 (REDACTED)
~ REDACTED < 2021/02/15 O $ 0,2325 (REDACTED)
Notice the marked transaction, which has been projected to a month where a transaction is already recorded and finalized. This is definitely wrong.
Though I'm not certain, I think the issue lies within estimated_transactions, which projects based on the latest entry and does not take payout or ex-date into account; i.e. these fields are left blank.
Would also be beneficial to use this specific scenario to build a test.
EDIT:
Important to note, though, that in this case, only the two most recent records actually have a payout date set; the rest do not. This, of course, means there is no date to swap and only the last 2 projections can be right.
The point being, this issue would not have come up if all transactions had actually recorded the payout date. There are already diagnostics warning about this stuff missing, if --verbose.
This problem is already "solved" by simply removing duplicate records after all input journals have been read. However, this could be an entirely unnecessary step (improving performance in all cases) in the process if it was simply not possible (i.e. ParseError) to include the same journal path more than once.
The problem with implementing this is that currently, reading a journal is a linear, line-by-line operation. To prevent duplicate journals we would have to make gathering include directives a separate pass.
The error message will only indicate the starting line number of the transaction where an error was encountered. Take this journal:
# this file serves as an example journal for dledger
2019/02/14 AAPL (100)
$ 73 kr
Here, the 3rd line starts a transaction, but line 4 contains a syntax error (ambiguous symbol). The error message, however, doesn't make that clear:
ValueError: dledger/example/simple.journal:3 ambiguous symbol definition ('$' or 'kr'?)
This is a nitpick, but it should either be simple.journal:4, or say something more explicit about "where" in this transaction the error occurred; maybe even both?
The scenario is this: you enter a preliminary record of a dividend transaction that you expect to hit sometime in the future. Now, that future might be far away, and once it does hit, you might have forgotten that you even entered this record.
The problem: reports will indicate that there is indeed an expected transaction, but it does not tell you that you should look for it in your journal to amend it. It looks exactly the same as if it was generated by a forecast. This may lead to multiple entries of the same transaction, which you'll probably notice once you run another report, but that bit of confusion should not have occurred in the first place.
Fractional shares are increasingly offered by major brokers, and might soon even become an expected feature for many users. We should probably support them as well.
Personally, I prefer the simplicity of non-fractional positions, and I'm not sure about the practicality of entering a fractional number manually in a journal, but I can also imagine a world where the whole thing is automated. So, I think, if we can support both seamlessly, then we might as well.
Technically, the one major change needed to support a fractional position is to transition Transaction.position from an int to a float.
This change involves updating journal reading/writing and any calculations using position as a factor, and might have unexpected consequences down the line (there are probably occurrences of equality checks, which is a whole thing when it comes to floats...), so tests should be made to ensure that we don't break current functionality.
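To illustrate the equality concern, here is a small sketch of two ways to compare fractional positions safely. Both helpers are hypothetical. The second uses decimal.Decimal, which is an alternative to the float proposed above and not something the project is known to use.

```python
import math
from decimal import Decimal


def positions_equal(a: float, b: float, tolerance: float = 1e-9) -> bool:
    """Compare two float positions with an absolute tolerance instead of ==."""
    return math.isclose(a, b, rel_tol=0.0, abs_tol=tolerance)


def parse_position(text: str) -> Decimal:
    """Alternative: keep positions as exact decimals parsed straight from the journal text."""
    return Decimal(text)
```

With floats, 0.1 + 0.2 == 0.3 evaluates to False, whereas positions_equal(0.1 + 0.2, 0.3) holds, and the same sum on Decimal values compares equal exactly.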
Exchange rates are only refreshed upon entering new transactions. This can quickly leave them stagnant and outdated, which affects the accuracy of forecasted estimates.
For example, if you only collect a single dividend once a year, the exchange rate will also only update once a year.
This could be perfectly acceptable, but it would also be very inaccurate. This is obviously an extreme example, but exchange rates do tend to fluctuate quite a bit, so even a difference by a month could have a significant impact on estimates.
I think it's worth considering adding a non-transaction directive to set a current exchange rate.
So, this works correctly; however, I always trip on this. My locale is Danish, as I want it to be, but I typically never input the Danish version of a month. So when I go ... -per=oct, the program abruptly breaks because there's no "October" in my localized calendar (there's "Oktober").
Maybe it should fall back to checking through month names in the en-US locale? This would work for my particular case, but it would not work if there's any ambiguity; i.e. if what you enter happens to match both calendars, but for different months. It would probably be more confusing than helpful. Maybe not using localized months at all is better?
Or, consider having a verbose diagnostic message notifying about the ambiguity, and what it chose and why (i.e. localized over english).
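A sketch of the fallback idea, preferring localized names over English ones. The function name match_month is hypothetical, and the month names are passed in explicitly here (rather than read from the active locale) purely to keep the example self-contained.

```python
from typing import Optional, Sequence

ENGLISH_MONTHS = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)


def match_month(
    text: str,
    localized_months: Sequence[str],
    fallback_months: Sequence[str] = ENGLISH_MONTHS,
) -> Optional[int]:
    """Return 1-12 for the first month name starting with `text`, localized names first."""
    needle = text.strip().lower()
    if not needle:
        return None
    for names in (localized_months, fallback_months):
        for number, name in enumerate(names, start=1):
            if name.lower().startswith(needle):
                return number
    return None
```

Since the localized list is always searched first, an input matching both calendars resolves to the localized month; a verbose diagnostic could report when the fallback list was the one that matched.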
In the following example, the estimated amount is incorrectly shown as 91, though the stored amount is really 91,2 (e.g. 48 * 1,90 = 91,2).
# case reproduction journal; fiddle with dates if needed
2022/02/22 AAA (10)
[2022/02/22] 13 kr
@ [2022/02/20] 1,30 kr
2023/02/26 AAA (48)
[2023/02/25]
@ [2023/02/21] 1,90 kr
$ dledger report --by-ticker=AAA
13 kr 2022/02/22 AAA (10) 1,30 kr
~ 91 kr ! 2023/02/26 AAA (48) 1,90 kr
This is a side-effect of always using the "preferred" number of decimal places for a given currency, and only considering those amounts read from a journal. This results in generated amounts (i.e. forecasted records) being rounded to fit.
In this case, only "13 kr" is a "real" entry, and thus 0 decimal places is considered the preference.
Line 207 in cd49170
Though the amount is just an estimate and will be corrected once the record is finalized, I do still think we could handle and display this better.
However, if we were to try determining the actual number of decimals for the estimated amount, it is likely that we end up with a report looking like this (due to floating point precision), which is definitely not any better:
13,000000000000 kr 2022/02/22 AAA (10) 1,30 kr
~ 91,199999999999 kr <~ 2023/02/26 AAA (48) 1,90 kr
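One way to sidestep the floating point noise, sketched under assumptions, is to compute the estimate with decimal.Decimal from the dividend as it was written in the journal. This swaps in exact decimal arithmetic, which is not necessarily how dledger stores amounts, and both function names are hypothetical.

```python
from decimal import Decimal


def estimate_amount(position: int, dividend_text: str, decimal_separator: str = ",") -> Decimal:
    """Multiply position by the dividend exactly as entered, without binary rounding."""
    dividend = Decimal(dividend_text.replace(decimal_separator, "."))
    return position * dividend


def format_decimal(value: Decimal, decimal_separator: str = ",") -> str:
    """Render a Decimal in fixed-point notation using the journal's separator."""
    return format(value, "f").replace(".", decimal_separator)
```

For the reproduction journal, estimate_amount(48, "1,90") gives exactly 91.20, so the number of decimals to display follows from the entered dividend rather than from guesswork.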
As it stands, you are not required to provide any currency/symbol for entered amounts.
For example, you can have this transaction:
2019/02/14 AAPL (100)
73
This leads to a few issues, though:
We can fix all of the above by being strict and requiring a symbol for any cash amount entered. Personally, I also think this makes the journal read better.
When you enter a preliminary record for a stock in which you just opened a position, the estimated amount is always assumed to be in the currency of the dividend.
However, you may know ahead of time that an exchange will happen into another currency, but the only way to report that is by using --in-currency=X, which is OK, but maybe this should be something you can indicate in the transaction itself too?
For example, here's a preliminary record:
2020/02/13 MMM (10)
[2020/03/12] @ $ 1,47
In this particular case, I know ahead of time that the money payout will be in DKK, but it will be reported as a $ amount, because that's the only possible inference, given zero past transactions:
~ $ 14,7 ! 2020/02/13 MMM [2020/03/12]
A potential solution could be to allow entering currency, but without the actual amount:
2020/02/13 MMM (10)
[2020/03/12] DKK @ $ 1,47
This says "I expect this transaction to be paid out on March 12, amount in DKK"
Right now, this resolves to dledger.journal.ParseError: .journal:1 invalid value (''), because a number is assumed present if there's any non-whitespace.
I guess one issue with this approach is the ambiguity in how the amount will be formatted (e.g. does the currency symbol go on left/right side? with/without padding?). I suppose we could look to other transactions using the same symbol and copy that, and otherwise just go with a probable default/guess.
It's not uncommon that companies perform stock splits. It's a technical detail and it typically doesn't affect the end result. As such, it's not a problem for dledger, because it just reports the numbers fed to it.
Actually, that's not entirely true; a split does affect dledger, because forecasts could be affected by reading into a false-positive negative trend (dividend appears to have been reduced; project accordingly).
More importantly, it causes detailed reports to be more difficult to read and reason about without prior knowledge.
For example, AAPL went through a 4-to-1 split on August 28, 2020, which results in reports like the one below:
$ 7.3 2019/02/15 AAPL (10) $ 0.730
$ 7.7 2019/05/17 AAPL (10) $ 0.770
$ 7.7 2019/08/19 AAPL (10) $ 0.770
$ 7.7 2019/11/15 AAPL (10) $ 0.770
$ 7.7 2020/02/14 AAPL (10) $ 0.770
$ 8.2 2020/05/15 AAPL (10) $ 0.820
$ 8.2 2020/08/14 AAPL (10) $ 0.820
$ 8.2 2020/11/13 AAPL (40) $ 0.205 # split 4-to-1 prior to this payout; note position and dividend
$ 8.2 2021/02/12 AAPL (40) $ 0.205
~ $ 8.2 <~ 2021/05/15 AAPL (40) $ 0.205
~ $ 8.2 <~ 2021/08/15 AAPL (40) $ 0.205
~ $ 8.2 <~ 2021/11/15 AAPL (40) $ 0.205
~ $ 8.2 <~ 2022/02/15 AAPL (40) $ 0.205
Note that without knowledge of the split, this could read both as if (1) the dividend was reduced, and (2) more shares were bought, increasing the position. However, neither of these points is true.
There's simply no indication about the split at all, because this is not something dledger can deduce from its records.
What I'd like to see is something like this instead:
$ 7.3 2019/02/15 AAPL (40) $ 0.1825 # position: 10 x 4
$ 7.7 2019/05/17 AAPL (40) $ 0.1925
$ 7.7 2019/08/19 AAPL (40) $ 0.1925
$ 7.7 2019/11/15 AAPL (40) $ 0.1925
$ 7.7 2020/02/14 AAPL (40) $ 0.1925
$ 8.2 2020/05/15 AAPL (40) $ 0.2050
$ 8.2 2020/08/14 AAPL (40) $ 0.2050 # dividend: 0.820 / 4
$ 8.2 2020/11/13 AAPL (40) $ 0.2050
$ 8.2 2021/02/12 AAPL (40) $ 0.2050
~ $ 8.2 <~ 2021/05/15 AAPL (40) $ 0.2050
~ $ 8.2 <~ 2021/08/15 AAPL (40) $ 0.2050
~ $ 8.2 <~ 2021/11/15 AAPL (40) $ 0.2050
~ $ 8.2 <~ 2022/02/15 AAPL (40) $ 0.2050
This report does not indicate anything about splits, because it's not important here; it's only important that the numbers are synchronized; i.e. even though you didn't start off your position with 40 shares, for all intents and purposes, you might as well have, relative to the development of outstanding shares.
For reference, see https://seekingalpha.com/symbol/AAPL/dividends/history as its historical chart includes dividend adjusted for splits.
Support for splits would start in the journal. A syntax is required to explicitly indicate the kind of split to perform. Seems obvious to extend the existing position syntax. For example:
2020/08/28 AAPL (10 x 4/1) # split 4-to-1
and maybe by shorthand:
2020/08/28 AAPL (x 4/1) # split 4-to-1
Note that it would not be good enough to simply put in the new position here, as that would indicate a position increase; i.e. the payout/dividend would rise accordingly. Currently, to apply a split, you do just that, but only at next payout, so you can also apply the adjusted dividend.
However, that is not the desired approach in this case; we want to specifically indicate the split so that it can be applied to both past and future records. This means that it has to be an explicit directive for dledger to understand the intent.
Some additional examples:
2020/12/10 ROL (10 x 3/2) # split 3-to-2 => 10x(3/2) => 15
2020/01/01 ABC (10 x 2/3) # reverse split 2-to-3 => 10x(2/3) => 6.6666667
Note the fractional position in the last example. How would we handle this? Typically, a shareholder would receive cash corresponding to the remainder, and end up with a flat 6 shares. However, this is not necessarily the case, and dledger can't know whether or not it is.
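The arithmetic behind these examples can be sketched with exact fractions: a split multiplies the position by the ratio and divides the dividend by it, leaving the payout unchanged. The function name apply_split is hypothetical, and the sketch deliberately leaves open how a fractional remainder would be settled.

```python
from fractions import Fraction
from typing import Tuple


def apply_split(position: Fraction, dividend: Fraction, ratio: Fraction) -> Tuple[Fraction, Fraction]:
    """Scale a pre-split record so it lines up with post-split records."""
    return position * ratio, dividend / ratio


# AAPL 4-to-1: (10) @ $ 0.730 becomes (40) @ $ 0.1825; the payout stays $ 7.3.
position, dividend = apply_split(Fraction(10), Fraction("0.730"), Fraction(4, 1))
payout = position * dividend
```

Using Fraction keeps a reverse split such as 10 x 2/3 exact (20/3) instead of forcing an early rounding decision.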
Similarly, we have a problem in regards to decimal places. The program prefers to use what was put in by hand, so in the AAPL case, the adjusted dividend would be truncated to e.g. $ 0.193 (due to real input of $ 0.205; or even worse, $ 0.19 if input was $ 0.82) when in fact it should be $ 0.1925.
The decimal place problem goes deep, and is probably not an easy fix.
EDIT:
Of course, the easiest fix of all is to adjust for splits manually, editing each past record in your journal... But, 1) that's a ton of work and error-prone (buy/sell transactions could easily mess things up), and 2), feels like changing history, which I, personally, would prefer leaving as is (I think?).
A shorthand like tomorrow is even mentioned in the MANUAL, which is just confusing considering it's not a thing yet.