Giter Site home page Giter Site logo

Comments (11)

agnivade avatar agnivade commented on May 25, 2024 2

I'll take this one.

from mattermost-plugin-github.

agnivade avatar agnivade commented on May 25, 2024 1

A rough design sketch:

Step 1: Extract the permalinks from the message

I think a regex is the best way to easily capture all the links without creating a hand-written link parser. This is what I have now https?://(?P<haswww>www.)?github.com/(?P<user>\w+)/(?P<repo>\w+)/blob/(?P<commit>\w+)/(?P<path>[\w/.]+)#(?P<line>[\w-]+)? and it does the job. The size is mainly due to the named capture groups; otherwise it's pretty simple.

Notes: the link must be a permalink to a file having a line number or a line number range. Anything else will not work. Also, more importantly, permalinks inside hyperlinks should not be extracted (Need to see how best to do this).

Step 2: Get the lines

Github does not provide any API to just get the line(s) from a permalink. From the browser it makes a call to https://github.com/preview with some query params and it gets an HTML response. So I think internally, github server does something. So we need to rely on the contents API to give us the file contents and filter out the lines by ourselves.

Notes: The API only supports files of 1MB in size, which I think is a good thing. We don't want to fetch huge files just to show a code preview. Secondly, what if somebody posts 500 permalinks in a single message ? Some thought needs to be given on limiting the no. of requests.

Step 3: Construct the markdown

For now, I have a simple version where I create a link to the given file with a markdown code block showing the desired lines. We can iterate further on the design. I also check the file extension and add the appropriate language tag in the markdown.

I have a rough prototype ready:

Before:
before

After:
after

@marianunez / @hanzei - Does this approach sound reasonable to you ?

Side note: I see that for the /slash commands, the github client is constructed anew every time from the userID in the header. I think perhaps, it is better to wrap all usages of that in a sync.Once, so that the client is created only once and reused for future API calls.

from mattermost-plugin-github.

marianunez avatar marianunez commented on May 25, 2024

This looks great @agnivade! Only a couple of observations:

  1. The contents API seems to return a link to the source of the file where you would have to parse it in order to find the needed lines. One thing that I noticed is that within the permalink it does contain within the DOM the source code lines that Github uses to display. You could just look for the html items of the given lines - you would still need to do some parsing on them but may save us a Github API call. Just another option to explore:

Screen Shot 2019-10-03 at 2 23 02 PM

  1. Looking good 👌

from mattermost-plugin-github.

agnivade avatar agnivade commented on May 25, 2024

The contents API seems to return a link to the source of the file where you would have to parse it in order to find the needed lines.

Prima facie, that's what it looked like to me too. But it also returns a base64 encoded version of the file contents itself. See the "content" field. So it's just one Github call.

One thing that I noticed is that within the permalink it does contain within the DOM the source code lines that Github uses to display.

Didn't get this. Did you mean to use the github internal API (https://github.com/preview) to get the html ? That uses some undocumented query parameters like markdown_unsupported=false&repository=140969738&subject=8&subject_type=Issue and is subject to change any time from Github.

from mattermost-plugin-github.

marianunez avatar marianunez commented on May 25, 2024

Didn't get this. Did you mean to use the github internal API (https://github.com/preview) to get the html ?

I meant an actual GET Http Request to get the permalink HTML contents but you are right, it's one request either way.

from mattermost-plugin-github.

agnivade avatar agnivade commented on May 25, 2024

Right, and even then the underlying HTML structure can change anytime. I think it is safer and cleaner to use the API.

from mattermost-plugin-github.

marianunez avatar marianunez commented on May 25, 2024

Right, and even then the underlying HTML structure can change anytime. I think it is safer and cleaner to use the API.

Agreed!

Also, we should add a #4 to your list for having a setting to enable/disable this feature on the plugin.

from mattermost-plugin-github.

agnivade avatar agnivade commented on May 25, 2024

Also, we should add a #4 to your list for having a setting to enable/disable this feature on the plugin.

Yes absolutely.

Nearly done with it. Tested all scenarios. Seems to work. Remaining work is to add the config switch, write test cases and add more polish.

Btw: here's a nice visualization of the regex I am using - https://jex.im/regulex/#!cmd=export&flags=&re=https%3F%3A%2F%2F(www%5C.)%3Fgithub%5C.com%2F(%5Cw%2B)%2F(%5Cw%2B)%2Fblob%2F(%5Cw%2B)%2F(%5B%5Cw%2F.%5D%2B)%23(%5B%5Cw-%5D%2B)%3F.

One last note:
The current logic will expand links inside backtick blocks too. To get around this, we would need a full-blown markdown parser to parse every single message. That seems like a lot of work for messages which don't have any permalinks at all. And after that, we need a regex pass again to extract the links.

A better way might be to somehow achieve this on the client side, because the client is already doing the markdown parsing. But that would need passing the github token to the client (which might be a security issue). Github can do that because they own the pipeline end to end.

For now, I think we can go ahead with the current flow which is fast and simple; and circle back on further improvements depending on what the feedback is. At worst case, it is anyways killable with a switch.

from mattermost-plugin-github.

marianunez avatar marianunez commented on May 25, 2024

One last note:
The current logic will expand links inside backtick blocks too. To get around this, we would need a full-blown markdown parser to parse every single message. That seems like a lot of work for messages which don't have any permalinks at all. And after that, we need a regex pass again to extract the links.

A better way might be to somehow achieve this on the client side, because the client is already doing the markdown parsing. But that would need passing the github token to the client (which might be a security issue). Github can do that because they own the pipeline end to end.

For now, I think we can go ahead with the current flow which is fast and simple; and circle back on further improvements depending on what the feedback is. At worst case, it is anyways killable with a switch.

I agree. This border case can be handled as a separate PR as a fix/enhancement in the future.

from mattermost-plugin-github.

ThiefMaster avatar ThiefMaster commented on May 25, 2024

To get around this, we would need a full-blown markdown parser to parse every single message. That seems like a lot of work for messages which don't have any permalinks at all.

Is this really a problem? The autolink plugin already does this for example to avoid the - IMHO broken - problem of replacing stuff inside code blocks.

from mattermost-plugin-github.

agnivade avatar agnivade commented on May 25, 2024

There is certainly an overhead here to pass every single message through a markdown parser. Whether that overhead is acceptable or not is a different question, given that we don't have any SLAs around plugin response latency.

from mattermost-plugin-github.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.