mbox series

[RFC,0/9] Hirschmann Hellcreek DSA driver

Message ID 20200618064029.32168-1-kurt@linutronix.de
Headers show
Series Hirschmann Hellcreek DSA driver | expand

Message

Kurt Kanzenbach June 18, 2020, 6:40 a.m. UTC
Hi,

this series adds a DSA driver for the Hirschmann Hellcreek TSN switch
IP. Characteristics of that IP:

 * Full duplex Ethernet interface at 100/1000 Mbps on three ports
 * IEEE 802.1Q-compliant Ethernet Switch
 * IEEE 802.1Qbv Time-Aware scheduling support
 * IEEE 1588 and IEEE 802.1AS support

That IP is used e.g. in

 https://www.arrow.com/en/campaigns/arrow-kairos

Due to the hardware setup the switch driver is implemented using DSA. A special
tagging protocol is leveraged. Furthermore, this driver supports PTP, hardware
timestamping and TAPRIO offloading.

This work is part of the AccessTSN project: https://www.accesstsn.com/

If there are any objections let me know.

Thanks,
Kurt

Kamil Alkhouri (2):
  net: dsa: hellcreek: Add PTP clock support
  net: dsa: hellcreek: Add support for hardware timestamping

Kurt Kanzenbach (7):
  net: dsa: Add tag handling for Hirschmann Hellcreek switches
  net: dsa: Add DSA driver for Hirschmann Hellcreek switches
  net: dsa: hellcreek: Add TAPRIO offloading support
  net: dsa: hellcreek: Add debugging mechanisms
  net: dsa: hellcreek: Add PTP status LEDs
  dt-bindings: Add vendor prefix for Hirschmann
  dt-bindings: net: dsa: Add documentation for Hellcreek switches

 .../devicetree/bindings/net/dsa/hellcreek.txt |   72 +
 .../devicetree/bindings/vendor-prefixes.yaml  |    2 +
 drivers/net/dsa/Kconfig                       |    2 +
 drivers/net/dsa/Makefile                      |    1 +
 drivers/net/dsa/hirschmann/Kconfig            |    7 +
 drivers/net/dsa/hirschmann/Makefile           |    5 +
 drivers/net/dsa/hirschmann/hellcreek.c        | 1751 +++++++++++++++++
 drivers/net/dsa/hirschmann/hellcreek.h        |  302 +++
 .../net/dsa/hirschmann/hellcreek_hwtstamp.c   |  492 +++++
 .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |   58 +
 drivers/net/dsa/hirschmann/hellcreek_ptp.c    |  369 ++++
 drivers/net/dsa/hirschmann/hellcreek_ptp.h    |   76 +
 include/net/dsa.h                             |    2 +
 net/dsa/Kconfig                               |    6 +
 net/dsa/Makefile                              |    1 +
 net/dsa/tag_hellcreek.c                       |  101 +
 16 files changed, 3247 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/hellcreek.txt
 create mode 100644 drivers/net/dsa/hirschmann/Kconfig
 create mode 100644 drivers/net/dsa/hirschmann/Makefile
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek.c
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek.h
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_ptp.c
 create mode 100644 drivers/net/dsa/hirschmann/hellcreek_ptp.h
 create mode 100644 net/dsa/tag_hellcreek.c

Comments

Andrew Lunn June 18, 2020, 1:42 p.m. UTC | #1
On Thu, Jun 18, 2020 at 08:40:21AM +0200, Kurt Kanzenbach wrote:
> The Hirschmann Hellcreek TSN switches have a special tagging protocol for frames
> exchanged between the CPU port and the master interface. The format is a one
> byte trailer indicating the destination or origin port.
> 
> It's quite similar to the Micrel KSZ tagging. That's why the implementation is
> based on that code.
> 
> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
> ---
>  include/net/dsa.h       |   2 +
>  net/dsa/Kconfig         |   6 +++
>  net/dsa/Makefile        |   1 +
>  net/dsa/tag_hellcreek.c | 101 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 110 insertions(+)
>  create mode 100644 net/dsa/tag_hellcreek.c
> 
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 50389772c597..2784c4851d92 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -44,6 +44,7 @@ struct phylink_link_state;
>  #define DSA_TAG_PROTO_KSZ8795_VALUE		14
>  #define DSA_TAG_PROTO_OCELOT_VALUE		15
>  #define DSA_TAG_PROTO_AR9331_VALUE		16
> +#define DSA_TAG_PROTO_HELLCREEK_VALUE		17
>  
>  enum dsa_tag_protocol {
>  	DSA_TAG_PROTO_NONE		= DSA_TAG_PROTO_NONE_VALUE,
> @@ -63,6 +64,7 @@ enum dsa_tag_protocol {
>  	DSA_TAG_PROTO_KSZ8795		= DSA_TAG_PROTO_KSZ8795_VALUE,
>  	DSA_TAG_PROTO_OCELOT		= DSA_TAG_PROTO_OCELOT_VALUE,
>  	DSA_TAG_PROTO_AR9331		= DSA_TAG_PROTO_AR9331_VALUE,
> +	DSA_TAG_PROTO_HELLCREEK		= DSA_TAG_PROTO_HELLCREEK_VALUE,
>  };
>  
>  struct packet_type;
> diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
> index d5bc6ac599ef..edc0c3ab6a4e 100644
> --- a/net/dsa/Kconfig
> +++ b/net/dsa/Kconfig
> @@ -121,4 +121,10 @@ config NET_DSA_TAG_TRAILER
>  	  Say Y or M if you want to enable support for tagging frames at
>  	  with a trailed. e.g. Marvell 88E6060.
>  
> +config NET_DSA_TAG_HELLCREEK
> +	tristate "Tag driver for Hirschmann Hellcreek TSN switches"
> +	help
> +	  Say Y or M if you want to enable support for tagging frames
> +	  for the Hirschmann Hellcreek TSN switches.
> +

Hi Kurt

This file is roughly in alphabetic order based on the tristate
string. Please move this before "Tag driver for Lantiq / Intel GSWIP
switches" to keep with the sorting.

Otherwise this looks good.

	  Andrew
Andrew Lunn June 18, 2020, 3:22 p.m. UTC | #2
> +static void __hellcreek_select_port(struct hellcreek *hellcreek, int port)

Hi Kurt

In general, please can you drop all these __ prefixes. They are useful
when you have for example hellcreek_select_port() takes a lock and
then calls _hellcreek_select_port(), but here, there is no need for
them.

> +static int hellcreek_detect(struct hellcreek *hellcreek)
> +{
> +	u8 tgd_maj, tgd_min;
> +	u16 id, rel_low, rel_high, date_low, date_high, tgd_ver;
> +	u32 rel, date;

Reverse Christmas tree please. Here and everywhere else.

> +
> +	id	  = hellcreek_read(hellcreek, HR_MODID_C);
> +	rel_low	  = hellcreek_read(hellcreek, HR_REL_L_C);
> +	rel_high  = hellcreek_read(hellcreek, HR_REL_H_C);
> +	date_low  = hellcreek_read(hellcreek, HR_BLD_L_C);
> +	date_high = hellcreek_read(hellcreek, HR_BLD_H_C);
> +	tgd_ver   = hellcreek_read(hellcreek, TR_TGDVER);
> +
> +	if (id != HELLCREEK_MODULE_ID)
> +		return -ENODEV;
> +
> +	rel	= rel_low | (rel_high << 16);
> +	date	= date_low | (date_high << 16);
> +	tgd_maj = (tgd_ver & TR_TGDVER_REV_MAJ_MASK) >> TR_TGDVER_REV_MAJ_SHIFT;
> +	tgd_min = (tgd_ver & TR_TGDVER_REV_MIN_MASK) >> TR_TGDVER_REV_MIN_SHIFT;
> +
> +	dev_info(hellcreek->dev, "Module ID=%02x Release=%04x Date=%04x TGD Version=%02x.%02x\n",
> +		 id, rel, date, tgd_maj, tgd_min);
> +
> +	return 0;
> +}

> +static int hellcreek_port_enable(struct dsa_switch *ds, int port,
> +				 struct phy_device *phy)
> +{
> +	struct hellcreek *hellcreek = ds->priv;
> +	struct hellcreek_port *hellcreek_port = &hellcreek->ports[port];
> +	unsigned long flags;
> +	u16 val;
> +
> +	if (port >= NUM_PORTS)
> +		return -EINVAL;
> +
> +	dev_info(hellcreek->dev, "Enable port %d\n", port);

dev_dbg().

+
> +	spin_lock_irqsave(&hellcreek->reg_lock, flags);

I've not seen any interrupt handling code in the driver, and there is
no mention of an interrupt in the DT binding. Do you really need
_irqsave spin locks?

> +
> +	__hellcreek_select_port(hellcreek, port);
> +	val = hellcreek_port->ptcfg;
> +	val |= HR_PTCFG_ADMIN_EN;
> +	hellcreek_write(hellcreek, val, HR_PTCFG);
> +	hellcreek_port->ptcfg = val;
> +
> +	spin_unlock_irqrestore(&hellcreek->reg_lock, flags);
> +
> +	return 0;
> +}
> +
> +static void hellcreek_port_disable(struct dsa_switch *ds, int port)
> +{
> +	struct hellcreek *hellcreek = ds->priv;
> +	struct hellcreek_port *hellcreek_port = &hellcreek->ports[port];
> +	unsigned long flags;
> +	u16 val;
> +
> +	if (port >= NUM_PORTS)
> +		return;

I don't think this test is actually needed, here or in any of the
other callbacks. If it does happen, it means we have a core bug we
should fix.

> +
> +	dev_info(hellcreek->dev, "Disable port %d\n", port);

dev_dbg()

> +	spin_lock_irqsave(&hellcreek->reg_lock, flags);
> +
> +	__hellcreek_select_port(hellcreek, port);
> +	val = hellcreek_port->ptcfg;
> +	val &= ~HR_PTCFG_ADMIN_EN;
> +	hellcreek_write(hellcreek, val, HR_PTCFG);
> +	hellcreek_port->ptcfg = val;
> +
> +	spin_unlock_irqrestore(&hellcreek->reg_lock, flags);
> +}
> +

> +static void hellcreek_get_ethtool_stats(struct dsa_switch *ds, int port,
> +					uint64_t *data)
> +{
> +	struct hellcreek *hellcreek = ds->priv;
> +	struct hellcreek_port *hellcreek_port = &hellcreek->ports[port];
> +	unsigned long flags;
> +	int i;
> +
> +	if (port >= NUM_PORTS)
> +		return;
> +
> +	spin_lock_irqsave(&hellcreek->reg_lock, flags);
> +	for (i = 0; i < ARRAY_SIZE(hellcreek_counter); ++i) {
> +		const struct hellcreek_counter *counter = &hellcreek_counter[i];
> +		u8 offset = counter->offset + port * 64;
> +		u16 high, low;
> +		u64 value = 0;
> +
> +		__hellcreek_select_counter(hellcreek, offset);
> +
> +		high  = hellcreek_read(hellcreek, HR_CRDH);
> +		low   = hellcreek_read(hellcreek, HR_CRDL);
> +		value = (high << 16) | low;

Is there some sort of snapshot happening here? Or do you need to read
high again, to make sure it has not incremented when low was being
read?

> +
> +		hellcreek_port->counter_values[i] += value;
> +		data[i] = hellcreek_port->counter_values[i];
> +	}
> +	spin_unlock_irqrestore(&hellcreek->reg_lock, flags);
> +}
> +
> +static int hellcreek_vlan_prepare(struct dsa_switch *ds, int port,
> +				  const struct switchdev_obj_port_vlan *vlan)
> +{
> +	struct hellcreek *hellcreek = ds->priv;
> +
> +	/* Nothing todo */
> +	dev_info(hellcreek->dev, "VLAN prepare for port %d\n", port);

dev_dbg()

> +
> +	return 0;
> +}
> +

> +static void hellcreek_vlan_add(struct dsa_switch *ds, int port,
> +			       const struct switchdev_obj_port_vlan *vlan)
> +{
> +	struct hellcreek *hellcreek = ds->priv;
> +	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
> +	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
> +	unsigned long flags;
> +	u16 vid;
> +
> +	if (port >= NUM_PORTS || vlan->vid_end >= 4096)

Again, if vlan->vid_end >= 4096 is true, there is a core bug which
needs fixing.

> +		return;
> +
> +	dev_info(hellcreek->dev, "Add VLANs (%d -- %d) on port %d, %s, %s\n",
> +		 vlan->vid_begin, vlan->vid_end, port,
> +		 untagged ? "untagged" : "tagged",
> +		 pvid ? "PVID" : "no PVID");

Please go thought the driver and consider all you dev_info()
statements. Should they be dev_dbg()?

	    Andrew
Jakub Kicinski June 18, 2020, 3:46 p.m. UTC | #3
On Thu, 18 Jun 2020 08:40:23 +0200 Kurt Kanzenbach wrote:
> From: Kamil Alkhouri <kamil.alkhouri@hs-offenburg.de>
> 
> The switch has internal PTP hardware clocks. Add support for it. There are three
> clocks:
> 
>  * Synchronized
>  * Syntonized
>  * Free running
> 
> Currently the synchronized clock is exported to user space which is a good
> default for the beginning. The free running clock might be exported later
> e.g. for implementing 802.1AS-2011/2020 Time Aware Bridges (TAB). The switch
> also supports cross time stamping for that purpose.
> 
> The implementation adds support setting/getting the time as well as offset and
> frequency adjustments. However, the clock only holds a partial timeofday
> timestamp. This is why we track the seconds completely in software (see overflow
> work and last_ts).
> 
> Furthermore, add the PTP multicast addresses into the FDB to forward that
> packages only to the CPU port where they are processed by a PTP program.
> 
> Signed-off-by: Kamil Alkhouri <kamil.alkhouri@hs-offenburg.de>
> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>

Please make sure each patch in the series builds cleanly with the W=1
C=1 flags. Here we have:

../drivers/net/dsa/hirschmann/hellcreek_ptp.c: In function ‘__hellcreek_ptp_clock_read’:
../drivers/net/dsa/hirschmann/hellcreek_ptp.c:30:28: warning: variable ‘sech’ set but not used [-Wunused-but-set-variable]
   30 |  u16 nsl, nsh, secl, secm, sech;
      |                            ^~~~
../drivers/net/dsa/hirschmann/hellcreek_ptp.c:30:22: warning: variable ‘secm’ set but not used [-Wunused-but-set-variable]
   30 |  u16 nsl, nsh, secl, secm, sech;
      |                      ^~~~
../drivers/net/dsa/hirschmann/hellcreek_ptp.c:30:16: warning: variable ‘secl’ set but not used [-Wunused-but-set-variable]
   30 |  u16 nsl, nsh, secl, secm, sech;
      |                ^~~~

Next patch adds a few more.
Andrew Lunn June 18, 2020, 5:23 p.m. UTC | #4
> +static u64 __hellcreek_ptp_clock_read(struct hellcreek *hellcreek)
> +{
> +	u16 nsl, nsh, secl, secm, sech;
> +
> +	/* Take a snapshot */
> +	hellcreek_ptp_write(hellcreek, PR_COMMAND_C_SS, PR_COMMAND_C);
> +
> +	/* The time of the day is saved as 96 bits. However, due to hardware
> +	 * limitations the seconds are not or only partly kept in the PTP
> +	 * core. That's why only the nanoseconds are used and the seconds are
> +	 * tracked in software. Anyway due to internal locking all five
> +	 * registers should be read.
> +	 */
> +	sech = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> +	secm = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> +	secl = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> +	nsh  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> +	nsl  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> +
> +	return (u64)nsl | ((u64)nsh << 16);

Hi Kurt

What are the hardware limitations? There seems to be 48 bits for
seconds? That allows for 8925104 years?

> +static u64 __hellcreek_ptp_gettime(struct hellcreek *hellcreek)
> +{
> +	u64 ns;
> +
> +	ns = __hellcreek_ptp_clock_read(hellcreek);
> +	if (ns < hellcreek->last_ts)
> +		hellcreek->seconds++;
> +	hellcreek->last_ts = ns;
> +	ns += hellcreek->seconds * NSEC_PER_SEC;

So the assumption is, this gets called at least once per second. And
if that does not happen, there is no recovery. The second is lost.

I'm just wondering if there is something more robust using what the
hardware does provide, even if the hardware is not perfect.

	 Andrew
Andrew Lunn June 18, 2020, 5:34 p.m. UTC | #5
On Thu, Jun 18, 2020 at 08:40:26AM +0200, Kurt Kanzenbach wrote:
> The switch has registers which are useful for debugging issues:

debugfs is not particularly likes. Please try to find other means
where possible. Memory usage fits nicely into devlink. See mv88e6xxx
which exports the ATU fill for example. Are trace registers counters?

> +static int hellcreek_debugfs_init(struct hellcreek *hellcreek)
> +{
> +	struct dentry *file;
> +
> +	hellcreek->debug_dir = debugfs_create_dir(dev_name(hellcreek->dev),
> +						  NULL);
> +	if (!hellcreek->debug_dir)
> +		return -ENOMEM;

Just a general comment. You should not check the return value from any
debugfs call, since it is totally optional. It will also do the right
thing if the previous call has failed. There are numerous emails from
GregKH about this.

       Andrew
Andrew Lunn June 18, 2020, 5:46 p.m. UTC | #6
On Thu, Jun 18, 2020 at 08:40:27AM +0200, Kurt Kanzenbach wrote:
> The switch has two controllable I/Os which are usually connected to LEDs. This
> is useful to immediately visually see the PTP status.
> 
> These provide two signals:
> 
>  * is_gm
> 
>    This LED can be activated if the current device is the grand master in that
>    PTP domain.
> 
>  * sync_good
> 
>    This LED can be activated if the current device is in sync with the network
>    time.
> 
> Expose these via the LED framework to be controlled via user space
> e.g. linuxptp.

Hi Kurt

Is the hardware driving these signals at all? Or are these just
suggested names in the documentation? It would not be an issue to have
user space to configure them to use the heartbeat trigger, etc?

     Andrew
Kurt Kanzenbach June 19, 2020, 8:12 a.m. UTC | #7
Hi Andrew,

first off thank you for the review.

On Thu Jun 18 2020, Andrew Lunn wrote:
>> +static void __hellcreek_select_port(struct hellcreek *hellcreek, int port)
>
> Hi Kurt
>
> In general, please can you drop all these __ prefixes. They are useful
> when you have for example hellcreek_select_port() takes a lock and
> then calls _hellcreek_select_port(), but here, there is no need for
> them.

OK, sure.

>
>> +static int hellcreek_detect(struct hellcreek *hellcreek)
>> +{
>> +	u8 tgd_maj, tgd_min;
>> +	u16 id, rel_low, rel_high, date_low, date_high, tgd_ver;
>> +	u32 rel, date;
>
> Reverse Christmas tree please. Here and everywhere else.

OK.

>
>> +
>> +	id	  = hellcreek_read(hellcreek, HR_MODID_C);
>> +	rel_low	  = hellcreek_read(hellcreek, HR_REL_L_C);
>> +	rel_high  = hellcreek_read(hellcreek, HR_REL_H_C);
>> +	date_low  = hellcreek_read(hellcreek, HR_BLD_L_C);
>> +	date_high = hellcreek_read(hellcreek, HR_BLD_H_C);
>> +	tgd_ver   = hellcreek_read(hellcreek, TR_TGDVER);
>> +
>> +	if (id != HELLCREEK_MODULE_ID)
>> +		return -ENODEV;
>> +
>> +	rel	= rel_low | (rel_high << 16);
>> +	date	= date_low | (date_high << 16);
>> +	tgd_maj = (tgd_ver & TR_TGDVER_REV_MAJ_MASK) >> TR_TGDVER_REV_MAJ_SHIFT;
>> +	tgd_min = (tgd_ver & TR_TGDVER_REV_MIN_MASK) >> TR_TGDVER_REV_MIN_SHIFT;
>> +
>> +	dev_info(hellcreek->dev, "Module ID=%02x Release=%04x Date=%04x TGD Version=%02x.%02x\n",
>> +		 id, rel, date, tgd_maj, tgd_min);
>> +
>> +	return 0;
>> +}
>
>> +static int hellcreek_port_enable(struct dsa_switch *ds, int port,
>> +				 struct phy_device *phy)
>> +{
>> +	struct hellcreek *hellcreek = ds->priv;
>> +	struct hellcreek_port *hellcreek_port = &hellcreek->ports[port];
>> +	unsigned long flags;
>> +	u16 val;
>> +
>> +	if (port >= NUM_PORTS)
>> +		return -EINVAL;
>> +
>> +	dev_info(hellcreek->dev, "Enable port %d\n", port);
>
> dev_dbg().

These are debug messages. I'll change that in the complete code.

>
> +
>> +	spin_lock_irqsave(&hellcreek->reg_lock, flags);
>
> I've not seen any interrupt handling code in the driver, and there is
> no mention of an interrupt in the DT binding. Do you really need
> _irqsave spin locks?

Yes, I do. The TAPRIO offloading patch later adds hrtimers and that's
why it's needed. However, in this particular patch it is not required,
but I didn't want to change all spin lock calls later.

>
>> +
>> +	__hellcreek_select_port(hellcreek, port);
>> +	val = hellcreek_port->ptcfg;
>> +	val |= HR_PTCFG_ADMIN_EN;
>> +	hellcreek_write(hellcreek, val, HR_PTCFG);
>> +	hellcreek_port->ptcfg = val;
>> +
>> +	spin_unlock_irqrestore(&hellcreek->reg_lock, flags);
>> +
>> +	return 0;
>> +}
>> +
>> +static void hellcreek_port_disable(struct dsa_switch *ds, int port)
>> +{
>> +	struct hellcreek *hellcreek = ds->priv;
>> +	struct hellcreek_port *hellcreek_port = &hellcreek->ports[port];
>> +	unsigned long flags;
>> +	u16 val;
>> +
>> +	if (port >= NUM_PORTS)
>> +		return;
>
> I don't think this test is actually needed, here or in any of the
> other callbacks. If it does happen, it means we have a core bug we
> should fix.

Agreed.

>
>> +
>> +	dev_info(hellcreek->dev, "Disable port %d\n", port);
>
> dev_dbg()
>
>> +	spin_lock_irqsave(&hellcreek->reg_lock, flags);
>> +
>> +	__hellcreek_select_port(hellcreek, port);
>> +	val = hellcreek_port->ptcfg;
>> +	val &= ~HR_PTCFG_ADMIN_EN;
>> +	hellcreek_write(hellcreek, val, HR_PTCFG);
>> +	hellcreek_port->ptcfg = val;
>> +
>> +	spin_unlock_irqrestore(&hellcreek->reg_lock, flags);
>> +}
>> +
>
>> +static void hellcreek_get_ethtool_stats(struct dsa_switch *ds, int port,
>> +					uint64_t *data)
>> +{
>> +	struct hellcreek *hellcreek = ds->priv;
>> +	struct hellcreek_port *hellcreek_port = &hellcreek->ports[port];
>> +	unsigned long flags;
>> +	int i;
>> +
>> +	if (port >= NUM_PORTS)
>> +		return;
>> +
>> +	spin_lock_irqsave(&hellcreek->reg_lock, flags);
>> +	for (i = 0; i < ARRAY_SIZE(hellcreek_counter); ++i) {
>> +		const struct hellcreek_counter *counter = &hellcreek_counter[i];
>> +		u8 offset = counter->offset + port * 64;
>> +		u16 high, low;
>> +		u64 value = 0;
>> +
>> +		__hellcreek_select_counter(hellcreek, offset);
>> +
>> +		high  = hellcreek_read(hellcreek, HR_CRDH);
>> +		low   = hellcreek_read(hellcreek, HR_CRDL);
>> +		value = (high << 16) | low;
>
> Is there some sort of snapshot happening here? Or do you need to read
> high again, to make sure it has not incremented when low was being
> read?

Yes, there is. When the counter is selected via
__hellcreek_select_counter() both registers are locked. That's why
there's no need to read high again. I'll add a comment for clarity.

>
>> +
>> +		hellcreek_port->counter_values[i] += value;
>> +		data[i] = hellcreek_port->counter_values[i];
>> +	}
>> +	spin_unlock_irqrestore(&hellcreek->reg_lock, flags);
>> +}
>> +
>> +static int hellcreek_vlan_prepare(struct dsa_switch *ds, int port,
>> +				  const struct switchdev_obj_port_vlan *vlan)
>> +{
>> +	struct hellcreek *hellcreek = ds->priv;
>> +
>> +	/* Nothing todo */
>> +	dev_info(hellcreek->dev, "VLAN prepare for port %d\n", port);
>
> dev_dbg()
>
>> +
>> +	return 0;
>> +}
>> +
>
>> +static void hellcreek_vlan_add(struct dsa_switch *ds, int port,
>> +			       const struct switchdev_obj_port_vlan *vlan)
>> +{
>> +	struct hellcreek *hellcreek = ds->priv;
>> +	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
>> +	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
>> +	unsigned long flags;
>> +	u16 vid;
>> +
>> +	if (port >= NUM_PORTS || vlan->vid_end >= 4096)
>
> Again, if vlan->vid_end >= 4096 is true, there is a core bug which
> needs fixing.
>
>> +		return;
>> +
>> +	dev_info(hellcreek->dev, "Add VLANs (%d -- %d) on port %d, %s, %s\n",
>> +		 vlan->vid_begin, vlan->vid_end, port,
>> +		 untagged ? "untagged" : "tagged",
>> +		 pvid ? "PVID" : "no PVID");
>
> Please go thought the driver and consider all you dev_info()
> statements. Should they be dev_dbg()?

I'll do.

Thanks,
Kurt
Kurt Kanzenbach June 19, 2020, 8:13 a.m. UTC | #8
Hi Jakub,

On Thu Jun 18 2020, Jakub Kicinski wrote:
> On Thu, 18 Jun 2020 08:40:23 +0200 Kurt Kanzenbach wrote:
>> From: Kamil Alkhouri <kamil.alkhouri@hs-offenburg.de>
>> 
>> The switch has internal PTP hardware clocks. Add support for it. There are three
>> clocks:
>> 
>>  * Synchronized
>>  * Syntonized
>>  * Free running
>> 
>> Currently the synchronized clock is exported to user space which is a good
>> default for the beginning. The free running clock might be exported later
>> e.g. for implementing 802.1AS-2011/2020 Time Aware Bridges (TAB). The switch
>> also supports cross time stamping for that purpose.
>> 
>> The implementation adds support setting/getting the time as well as offset and
>> frequency adjustments. However, the clock only holds a partial timeofday
>> timestamp. This is why we track the seconds completely in software (see overflow
>> work and last_ts).
>> 
>> Furthermore, add the PTP multicast addresses into the FDB to forward that
>> packages only to the CPU port where they are processed by a PTP program.
>> 
>> Signed-off-by: Kamil Alkhouri <kamil.alkhouri@hs-offenburg.de>
>> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
>
> Please make sure each patch in the series builds cleanly with the W=1
> C=1 flags. Here we have:
>
> ../drivers/net/dsa/hirschmann/hellcreek_ptp.c: In function ‘__hellcreek_ptp_clock_read’:
> ../drivers/net/dsa/hirschmann/hellcreek_ptp.c:30:28: warning: variable ‘sech’ set but not used [-Wunused-but-set-variable]
>    30 |  u16 nsl, nsh, secl, secm, sech;
>       |                            ^~~~
> ../drivers/net/dsa/hirschmann/hellcreek_ptp.c:30:22: warning: variable ‘secm’ set but not used [-Wunused-but-set-variable]
>    30 |  u16 nsl, nsh, secl, secm, sech;
>       |                      ^~~~
> ../drivers/net/dsa/hirschmann/hellcreek_ptp.c:30:16: warning: variable ‘secl’ set but not used [-Wunused-but-set-variable]
>    30 |  u16 nsl, nsh, secl, secm, sech;
>       |                ^~~~
>
> Next patch adds a few more.

Sorry, I did test with C=1 only. I'll fix it.

Thanks,
Kurt
Kurt Kanzenbach June 19, 2020, 8:26 a.m. UTC | #9
Hi Andrew,

On Thu Jun 18 2020, Andrew Lunn wrote:
>> +static u64 __hellcreek_ptp_clock_read(struct hellcreek *hellcreek)
>> +{
>> +	u16 nsl, nsh, secl, secm, sech;
>> +
>> +	/* Take a snapshot */
>> +	hellcreek_ptp_write(hellcreek, PR_COMMAND_C_SS, PR_COMMAND_C);
>> +
>> +	/* The time of the day is saved as 96 bits. However, due to hardware
>> +	 * limitations the seconds are not or only partly kept in the PTP
>> +	 * core. That's why only the nanoseconds are used and the seconds are
>> +	 * tracked in software. Anyway due to internal locking all five
>> +	 * registers should be read.
>> +	 */
>> +	sech = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> +	secm = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> +	secl = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> +	nsh  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> +	nsl  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> +
>> +	return (u64)nsl | ((u64)nsh << 16);
>
> Hi Kurt
>
> What are the hardware limitations? There seems to be 48 bits for
> seconds? That allows for 8925104 years?

In theory, yes. Due to hardware hardware considerations only a few or
none of these bits are used for the seconds. The rest is zero. Meaning
that the wraparound is not 8925104 years, but at e.g. 8 seconds when
using 3 bits for the seconds.

I've discussed this the Hirschmann people and they suggested to use the
nanoseconds only. That's what I did here.

>
>> +static u64 __hellcreek_ptp_gettime(struct hellcreek *hellcreek)
>> +{
>> +	u64 ns;
>> +
>> +	ns = __hellcreek_ptp_clock_read(hellcreek);
>> +	if (ns < hellcreek->last_ts)
>> +		hellcreek->seconds++;
>> +	hellcreek->last_ts = ns;
>> +	ns += hellcreek->seconds * NSEC_PER_SEC;
>
> So the assumption is, this gets called at least once per second. And
> if that does not happen, there is no recovery. The second is lost.

Yes, exactly. If a single overflow is missed, then the time is wrong.

>
> I'm just wondering if there is something more robust using what the
> hardware does provide, even if the hardware is not perfect.

I don't think there's a more robust way to do this. The overflow period
is a second which should be enough time to catch the overflow even if
the system is loaded. We did long running tests for days and the
mechanism worked fine. We could also consider to move the delayed work
to a dedicated thread which could be run with real time (SCHED_FIFO)
priority. But, I didn't see the need for it.

Thanks,
Kurt
Kurt Kanzenbach June 19, 2020, 8:36 a.m. UTC | #10
Hi Andrew,

On Thu Jun 18 2020, Andrew Lunn wrote:
> On Thu, Jun 18, 2020 at 08:40:26AM +0200, Kurt Kanzenbach wrote:
>> The switch has registers which are useful for debugging issues:
>
> debugfs is not particularly likes. Please try to find other means
> where possible. Memory usage fits nicely into devlink. See mv88e6xxx
> which exports the ATU fill for example.

OK, I'll have a look at devlink and the mv88e6xxx driver to see if that
could be utilized.

> Are trace registers counters?

No. The trace registers provide bits for error conditions and if packets
have been dropped e.g. because of full queues or FCS errors, and so on.

>
>> +static int hellcreek_debugfs_init(struct hellcreek *hellcreek)
>> +{
>> +	struct dentry *file;
>> +
>> +	hellcreek->debug_dir = debugfs_create_dir(dev_name(hellcreek->dev),
>> +						  NULL);
>> +	if (!hellcreek->debug_dir)
>> +		return -ENOMEM;
>
> Just a general comment. You should not check the return value from any
> debugfs call, since it is totally optional. It will also do the right
> thing if the previous call has failed. There are numerous emails from
> GregKH about this.

OK.

Thanks,
Kurt
Kurt Kanzenbach June 19, 2020, 8:45 a.m. UTC | #11
Hi Andrew,

On Thu Jun 18 2020, Andrew Lunn wrote:
> On Thu, Jun 18, 2020 at 08:40:27AM +0200, Kurt Kanzenbach wrote:
>> The switch has two controllable I/Os which are usually connected to LEDs. This
>> is useful to immediately visually see the PTP status.
>> 
>> These provide two signals:
>> 
>>  * is_gm
>> 
>>    This LED can be activated if the current device is the grand master in that
>>    PTP domain.
>> 
>>  * sync_good
>> 
>>    This LED can be activated if the current device is in sync with the network
>>    time.
>> 
>> Expose these via the LED framework to be controlled via user space
>> e.g. linuxptp.
>
> Hi Kurt
>
> Is the hardware driving these signals at all? Or are these just
> suggested names in the documentation? It would not be an issue to have
> user space to configure them to use the heartbeat trigger, etc?

These are more like GPIOs. If a 1 is set into the register then the
hardware drives the signal to high. The names are from the
documentation:

 * sync_good: This signal indicates that the switch is in sync
 * is_gm:     This signal indicates that the switch is the grand master

However, these signals have to be set by user space. Most likely these
signals are connected to LEDs.

Thanks,
Kurt
Andrew Lunn June 19, 2020, 1:40 p.m. UTC | #12
On Fri, Jun 19, 2020 at 10:26:44AM +0200, Kurt Kanzenbach wrote:
> Hi Andrew,
> 
> On Thu Jun 18 2020, Andrew Lunn wrote:
> >> +static u64 __hellcreek_ptp_clock_read(struct hellcreek *hellcreek)
> >> +{
> >> +	u16 nsl, nsh, secl, secm, sech;
> >> +
> >> +	/* Take a snapshot */
> >> +	hellcreek_ptp_write(hellcreek, PR_COMMAND_C_SS, PR_COMMAND_C);
> >> +
> >> +	/* The time of the day is saved as 96 bits. However, due to hardware
> >> +	 * limitations the seconds are not or only partly kept in the PTP
> >> +	 * core. That's why only the nanoseconds are used and the seconds are
> >> +	 * tracked in software. Anyway due to internal locking all five
> >> +	 * registers should be read.
> >> +	 */
> >> +	sech = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	secm = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	secl = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	nsh  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	nsl  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +
> >> +	return (u64)nsl | ((u64)nsh << 16);
> >
> > Hi Kurt
> >
> > What are the hardware limitations? There seems to be 48 bits for
> > seconds? That allows for 8925104 years?
> 
> In theory, yes. Due to hardware hardware considerations only a few or
> none of these bits are used for the seconds. The rest is zero. Meaning
> that the wraparound is not 8925104 years, but at e.g. 8 seconds when
> using 3 bits for the seconds.

Please add this to the comment.

> >> +static u64 __hellcreek_ptp_gettime(struct hellcreek *hellcreek)
> >> +{
> >> +	u64 ns;
> >> +
> >> +	ns = __hellcreek_ptp_clock_read(hellcreek);
> >> +	if (ns < hellcreek->last_ts)
> >> +		hellcreek->seconds++;
> >> +	hellcreek->last_ts = ns;
> >> +	ns += hellcreek->seconds * NSEC_PER_SEC;
> >
> > So the assumption is, this gets called at least once per second. And
> > if that does not happen, there is no recovery. The second is lost.
> 
> Yes, exactly. If a single overflow is missed, then the time is wrong.
> 
> >
> > I'm just wondering if there is something more robust using what the
> > hardware does provide, even if the hardware is not perfect.
> 
> I don't think there's a more robust way to do this. The overflow period
> is a second which should be enough time to catch the overflow even if
> the system is loaded. We did long running tests for days and the
> mechanism worked fine. We could also consider to move the delayed work
> to a dedicated thread which could be run with real time (SCHED_FIFO)
> priority. But, I didn't see the need for it.

If the hardware does give you 3 working bits for the seconds, you
could make use of that for a consistency check. If nothing else, you
could do a

dev_err(dev, 'PTP time is FUBAR');

	     Andrew
Andrew Lunn June 19, 2020, 1:42 p.m. UTC | #13
> > Are trace registers counters?
> 
> No. The trace registers provide bits for error conditions and if packets
> have been dropped e.g. because of full queues or FCS errors, and so on.

Is there some documentation somewhere? A better understanding of what
they can do might help figure out the correct API.

     Andrew
Andrew Lunn June 19, 2020, 1:52 p.m. UTC | #14
On Fri, Jun 19, 2020 at 10:45:36AM +0200, Kurt Kanzenbach wrote:
> Hi Andrew,
> 
> On Thu Jun 18 2020, Andrew Lunn wrote:
> > On Thu, Jun 18, 2020 at 08:40:27AM +0200, Kurt Kanzenbach wrote:
> >> The switch has two controllable I/Os which are usually connected to LEDs. This
> >> is useful to immediately visually see the PTP status.
> >> 
> >> These provide two signals:
> >> 
> >>  * is_gm
> >> 
> >>    This LED can be activated if the current device is the grand master in that
> >>    PTP domain.
> >> 
> >>  * sync_good
> >> 
> >>    This LED can be activated if the current device is in sync with the network
> >>    time.
> >> 
> >> Expose these via the LED framework to be controlled via user space
> >> e.g. linuxptp.
> >
> > Hi Kurt
> >
> > Is the hardware driving these signals at all? Or are these just
> > suggested names in the documentation? It would not be an issue to have
> > user space to configure them to use the heartbeat trigger, etc?
> 
> These are more like GPIOs. If a 1 is set into the register then the
> hardware drives the signal to high. The names are from the
> documentation:
> 
>  * sync_good: This signal indicates that the switch is in sync
>  * is_gm:     This signal indicates that the switch is the grand master
> 
> However, these signals have to be set by user space. Most likely these
> signals are connected to LEDs.

Thanks

Since these are general purpose LEDs, you might want to look at

Documentation/devicetree/bindings/leds/common.yaml

and implement some of the common properties. The label should ideally
correspond to the text on the case, not what the datasheet says. So
getting it from DT is a good idea. Do Hirschmanns own cases use this
text on there front plate?

	    Andrew
Kurt Kanzenbach June 22, 2020, 11:52 a.m. UTC | #15
On Fri Jun 19 2020, Andrew Lunn wrote:
> On Fri, Jun 19, 2020 at 10:26:44AM +0200, Kurt Kanzenbach wrote:
>> Hi Andrew,
>> 
>> On Thu Jun 18 2020, Andrew Lunn wrote:
>> >> +static u64 __hellcreek_ptp_clock_read(struct hellcreek *hellcreek)
>> >> +{
>> >> +	u16 nsl, nsh, secl, secm, sech;
>> >> +
>> >> +	/* Take a snapshot */
>> >> +	hellcreek_ptp_write(hellcreek, PR_COMMAND_C_SS, PR_COMMAND_C);
>> >> +
>> >> +	/* The time of the day is saved as 96 bits. However, due to hardware
>> >> +	 * limitations the seconds are not or only partly kept in the PTP
>> >> +	 * core. That's why only the nanoseconds are used and the seconds are
>> >> +	 * tracked in software. Anyway due to internal locking all five
>> >> +	 * registers should be read.
>> >> +	 */
>> >> +	sech = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	secm = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	secl = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	nsh  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	nsl  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +
>> >> +	return (u64)nsl | ((u64)nsh << 16);
>> >
>> > Hi Kurt
>> >
>> > What are the hardware limitations? There seems to be 48 bits for
>> > seconds? That allows for 8925104 years?
>> 
>> In theory, yes. Due to hardware hardware considerations only a few or
>> none of these bits are used for the seconds. The rest is zero. Meaning
>> that the wraparound is not 8925104 years, but at e.g. 8 seconds when
>> using 3 bits for the seconds.
>
> Please add this to the comment.

I will, no problem.

>
>> >> +static u64 __hellcreek_ptp_gettime(struct hellcreek *hellcreek)
>> >> +{
>> >> +	u64 ns;
>> >> +
>> >> +	ns = __hellcreek_ptp_clock_read(hellcreek);
>> >> +	if (ns < hellcreek->last_ts)
>> >> +		hellcreek->seconds++;
>> >> +	hellcreek->last_ts = ns;
>> >> +	ns += hellcreek->seconds * NSEC_PER_SEC;
>> >
>> > So the assumption is, this gets called at least once per second. And
>> > if that does not happen, there is no recovery. The second is lost.
>> 
>> Yes, exactly. If a single overflow is missed, then the time is wrong.
>> 
>> >
>> > I'm just wondering if there is something more robust using what the
>> > hardware does provide, even if the hardware is not perfect.
>> 
>> I don't think there's a more robust way to do this. The overflow period
>> is a second which should be enough time to catch the overflow even if
>> the system is loaded. We did long running tests for days and the
>> mechanism worked fine. We could also consider to move the delayed work
>> to a dedicated thread which could be run with real time (SCHED_FIFO)
>> priority. But, I didn't see the need for it.
>
> If the hardware does give you 3 working bits for the seconds, you
> could make use of that for a consistency check. If nothing else, you
> could do a
>
> dev_err(dev, 'PTP time is FUBAR');

OK. I'll add a check for that.

Thanks,
Kurt
Kurt Kanzenbach June 22, 2020, 12:32 p.m. UTC | #16
On Fri Jun 19 2020, Andrew Lunn wrote:
>> > Are trace registers counters?
>> 
>> No. The trace registers provide bits for error conditions and if packets
>> have been dropped e.g. because of full queues or FCS errors, and so on.
>
> Is there some documentation somewhere? A better understanding of what
> they can do might help figure out the correct API.

No, not that I'm aware of.

Actually there are a few more debugging mechanisms and features which
should be exposed somehow. Here's the list:

 * Trace registers for the error conditions. This feature needs to be
   configured for which ports should be traced
 * Memory registers for indicating how many free page and meta pointers
   are available (read-only)
 * Limit registers for configuring:
   * Maximum memory limit per port
   * Reserved memory for critical traffic
   * Background traffic rate
   * Maximum queue depth
 * Re-prioritization of packets based on the ether type (not mac address)
 * Packet logging (-> retrieval of packet time stamps) based on port, traffic class and direction
 * Queue tracking

What API would be useful for these mechanisms?

Thanks,
Kurt
Andrew Lunn June 22, 2020, 1:55 p.m. UTC | #17
On Mon, Jun 22, 2020 at 02:32:28PM +0200, Kurt Kanzenbach wrote:
> On Fri Jun 19 2020, Andrew Lunn wrote:
> >> > Are trace registers counters?
> >> 
> >> No. The trace registers provide bits for error conditions and if packets
> >> have been dropped e.g. because of full queues or FCS errors, and so on.
> >
> > Is there some documentation somewhere? A better understanding of what
> > they can do might help figure out the correct API.
> 
> No, not that I'm aware of.
> 
> Actually there are a few more debugging mechanisms and features which
> should be exposed somehow. Here's the list:
> 
>  * Trace registers for the error conditions. This feature needs to be
>    configured for which ports should be traced
>  * Memory registers for indicating how many free page and meta pointers
>    are available (read-only)
>  * Limit registers for configuring:
>    * Maximum memory limit per port
>    * Reserved memory for critical traffic
>    * Background traffic rate
>    * Maximum queue depth
>  * Re-prioritization of packets based on the ether type (not mac address)
>  * Packet logging (-> retrieval of packet time stamps) based on port, traffic class and direction
>  * Queue tracking
> 
> What API would be useful for these mechanisms?

Hi Kurt

You should take a look at devlink. Many of these fit devlink
resources. Use that where it fits. But you might end up with an out of
tree debugfs patch for your own debugging work. We have something
similar of mv88e6xxx.

	Andrew
Vladimir Oltean June 22, 2020, 2:14 p.m. UTC | #18
Hi Kurt,

On Mon, 22 Jun 2020 at 15:34, Kurt Kanzenbach <kurt@linutronix.de> wrote:
>
>  * Re-prioritization of packets based on the ether type (not mac address)

This can be done by offloading a tc skbedit priority action, and a
"protocol" key (even though I don't understand why you need to mention
"not mac address".

>  * Packet logging (-> retrieval of packet time stamps) based on port, traffic class and direction

What does this mean? tcpdump can give you this, for traffic destined
to the CPU. Do you want to mirror/sample traffic to the CPU for debug?

>
> What API would be useful for these mechanisms?
>
> Thanks,
> Kurt

Thanks,
-Vladimir
Kurt Kanzenbach June 23, 2020, 6:04 a.m. UTC | #19
Hi Vladimir,

On Mon Jun 22 2020, Vladimir Oltean wrote:
> Hi Kurt,
>
> On Mon, 22 Jun 2020 at 15:34, Kurt Kanzenbach <kurt@linutronix.de> wrote:
>>
>>  * Re-prioritization of packets based on the ether type (not mac address)
>
> This can be done by offloading a tc skbedit priority action, and a
> "protocol" key (even though I don't understand why you need to mention
> "not mac address".

Thanks. That seems like it can be used. I did mention the mac address,
because the switch has two ways of doing a re-prioritization either by
fdb entries via mac addresses or by the high level inspection (HLI)
based on the ether type.

>
>>  * Packet logging (-> retrieval of packet time stamps) based on port, traffic class and direction
>
> What does this mean? tcpdump can give you this, for traffic destined
> to the CPU. Do you want to mirror/sample traffic to the CPU for debug?
>

The switch can capture timestamps (nanoseconds) when packets have been
transmitted or received on all ports. The timestamps are stored in a
FIFO and can be retrieved later. As you said tcpdump only works for
packets at the CPU port. This feature is useful for debugging latency
issues for specific traffic flows.

Thanks,
Kurt
Kurt Kanzenbach June 23, 2020, 6:07 a.m. UTC | #20
On Mon Jun 22 2020, Andrew Lunn wrote:
> On Mon, Jun 22, 2020 at 02:32:28PM +0200, Kurt Kanzenbach wrote:
>> On Fri Jun 19 2020, Andrew Lunn wrote:
>> >> > Are trace registers counters?
>> >> 
>> >> No. The trace registers provide bits for error conditions and if packets
>> >> have been dropped e.g. because of full queues or FCS errors, and so on.
>> >
>> > Is there some documentation somewhere? A better understanding of what
>> > they can do might help figure out the correct API.
>> 
>> No, not that I'm aware of.
>> 
>> Actually there are a few more debugging mechanisms and features which
>> should be exposed somehow. Here's the list:
>> 
>>  * Trace registers for the error conditions. This feature needs to be
>>    configured for which ports should be traced
>>  * Memory registers for indicating how many free page and meta pointers
>>    are available (read-only)
>>  * Limit registers for configuring:
>>    * Maximum memory limit per port
>>    * Reserved memory for critical traffic
>>    * Background traffic rate
>>    * Maximum queue depth
>>  * Re-prioritization of packets based on the ether type (not mac address)
>>  * Packet logging (-> retrieval of packet time stamps) based on port, traffic class and direction
>>  * Queue tracking
>> 
>> What API would be useful for these mechanisms?
>
> Hi Kurt
>
> You should take a look at devlink. Many of these fit devlink
> resources. Use that where it fits. But you might end up with an out of
> tree debugfs patch for your own debugging work. We have something
> similar of mv88e6xxx.

I see. Maybe I'll keep the debug stuff out of tree for now and will come
back to it later.

Thanks,
Kurt
Richard Cochran June 24, 2020, 1:03 p.m. UTC | #21
On Thu, Jun 18, 2020 at 08:40:23AM +0200, Kurt Kanzenbach wrote:

> diff --git a/drivers/net/dsa/hirschmann/hellcreek.h b/drivers/net/dsa/hirschmann/hellcreek.h
> index a08a10cb5ab7..2d4422fd2567 100644
> --- a/drivers/net/dsa/hirschmann/hellcreek.h
> +++ b/drivers/net/dsa/hirschmann/hellcreek.h
> @@ -234,10 +234,17 @@ struct hellcreek_fdb_entry {
>  struct hellcreek {
>  	struct device *dev;
>  	struct dsa_switch *ds;
> +	struct ptp_clock *ptp_clock;
> +	struct ptp_clock_info ptp_clock_info;
>  	struct hellcreek_port ports[4];
> +	struct delayed_work overflow_work;
>  	spinlock_t reg_lock;	/* Switch IP register lock */
> +	spinlock_t ptp_lock;	/* PTP IP register lock */

Why use a spin lock and not a mutex?

>  	void __iomem *base;
> +	void __iomem *ptp_base;
>  	u8 *vidmbrcfg;		/* vidmbrcfg shadow */
> +	u64 seconds;		/* PTP seconds */
> +	u64 last_ts;		/* Used for overflow detection */
>  	size_t fdb_entries;
>  };

> +static int hellcreek_ptp_gettime(struct ptp_clock_info *ptp,
> +				 struct timespec64 *ts)
> +{
> +	struct hellcreek *hellcreek = ptp_to_hellcreek(ptp);
> +	u64 ns;
> +
> +	spin_lock(&hellcreek->ptp_lock);

Won't a mutex do here instead?

> +	ns = __hellcreek_ptp_gettime(hellcreek);
> +	spin_unlock(&hellcreek->ptp_lock);
> +
> +	*ts = ns_to_timespec64(ns);
> +
> +	return 0;
> +}

> +static int hellcreek_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
> +{
> +	struct hellcreek *hellcreek = ptp_to_hellcreek(ptp);
> +	u16 negative = 0, counth, countl;
> +	u32 count_val;
> +
> +	/* If the offset is larger than IP-Core slow offset resources. Don't
> +	 * concider slow adjustment. Rather, add the offset directly to the

consider

> +	 * current time
> +	 */
> +	if (abs(delta) > MAX_SLOW_OFFSET_ADJ) {
> +		struct timespec64 now, then = ns_to_timespec64(delta);
> +
> +		hellcreek_ptp_gettime(ptp, &now);
> +		now = timespec64_add(now, then);
> +		hellcreek_ptp_settime(ptp, &now);
> +
> +		return 0;
> +	}
> +
> +	if (delta < 0) {
> +		negative = 1;
> +		delta = -delta;
> +	}
> +
> +	/* 'count_val' does not exceed the maximum register size (2^30) */
> +	count_val = div_s64(delta, MAX_NS_PER_STEP);
> +
> +	counth = (count_val & 0xffff0000) >> 16;
> +	countl = count_val & 0xffff;
> +
> +	negative = (negative << 15) & 0x8000;
> +
> +	spin_lock(&hellcreek->ptp_lock);
> +
> +	/* Set offset write register */
> +	hellcreek_ptp_write(hellcreek, negative, PR_CLOCK_OFFSET_C);
> +	hellcreek_ptp_write(hellcreek, MAX_NS_PER_STEP, PR_CLOCK_OFFSET_C);
> +	hellcreek_ptp_write(hellcreek, MIN_CLK_CYCLES_BETWEEN_STEPS,
> +			    PR_CLOCK_OFFSET_C);
> +	hellcreek_ptp_write(hellcreek, countl,  PR_CLOCK_OFFSET_C);
> +	hellcreek_ptp_write(hellcreek, counth,  PR_CLOCK_OFFSET_C);
> +
> +	spin_unlock(&hellcreek->ptp_lock);
> +
> +	return 0;
> +}

> +int hellcreek_ptp_setup(struct hellcreek *hellcreek)
> +{
> +	u16 status;
> +
> +	/* Set up the overflow work */
> +	INIT_DELAYED_WORK(&hellcreek->overflow_work,
> +			  hellcreek_ptp_overflow_check);
> +
> +	/* Setup PTP clock */
> +	hellcreek->ptp_clock_info.owner = THIS_MODULE;
> +	snprintf(hellcreek->ptp_clock_info.name,
> +		 sizeof(hellcreek->ptp_clock_info.name),
> +		 dev_name(hellcreek->dev));
> +
> +	/* IP-Core can add up to 0.5 ns per 8 ns cycle, which means
> +	 * accumulator_overflow_rate shall not exceed 62.5 MHz (which adjusts
> +	 * the nominal frequency by 6.25%)
> +	 */
> +	hellcreek->ptp_clock_info.max_adj   = 62500000;
> +	hellcreek->ptp_clock_info.n_alarm   = 0;
> +	hellcreek->ptp_clock_info.n_pins    = 0;
> +	hellcreek->ptp_clock_info.n_ext_ts  = 0;
> +	hellcreek->ptp_clock_info.n_per_out = 0;
> +	hellcreek->ptp_clock_info.pps	    = 0;
> +	hellcreek->ptp_clock_info.adjfine   = hellcreek_ptp_adjfine;
> +	hellcreek->ptp_clock_info.adjtime   = hellcreek_ptp_adjtime;
> +	hellcreek->ptp_clock_info.gettime64 = hellcreek_ptp_gettime;
> +	hellcreek->ptp_clock_info.settime64 = hellcreek_ptp_settime;
> +	hellcreek->ptp_clock_info.enable    = hellcreek_ptp_enable;
> +
> +	hellcreek->ptp_clock = ptp_clock_register(&hellcreek->ptp_clock_info,
> +						  hellcreek->dev);
> +	if (IS_ERR(hellcreek->ptp_clock))
> +		return PTR_ERR(hellcreek->ptp_clock);

The ptp_clock_register() can also return NULL:

 * Returns a valid pointer on success or PTR_ERR on failure.  If PHC
 * support is missing at the configuration level, this function
 * returns NULL, and drivers are expected to gracefully handle that
 * case separately.

> +
> +	/* Enable the offset correction process, if no offset correction is
> +	 * already taking place
> +	 */
> +	status = hellcreek_ptp_read(hellcreek, PR_CLOCK_STATUS_C);
> +	if (!(status & PR_CLOCK_STATUS_C_OFS_ACT))
> +		hellcreek_ptp_write(hellcreek,
> +				    status | PR_CLOCK_STATUS_C_ENA_OFS,
> +				    PR_CLOCK_STATUS_C);
> +
> +	/* Enable the drift correction process */
> +	hellcreek_ptp_write(hellcreek, status | PR_CLOCK_STATUS_C_ENA_DRIFT,
> +			    PR_CLOCK_STATUS_C);
> +
> +	schedule_delayed_work(&hellcreek->overflow_work,
> +			      HELLCREEK_OVERFLOW_PERIOD);
> +
> +	return 0;
> +}

Thanks,
Richard
Kurt Kanzenbach June 25, 2020, 7:08 a.m. UTC | #22
Hi Richard,

On Wed Jun 24 2020, Richard Cochran wrote:
> On Thu, Jun 18, 2020 at 08:40:23AM +0200, Kurt Kanzenbach wrote:
>
>> diff --git a/drivers/net/dsa/hirschmann/hellcreek.h b/drivers/net/dsa/hirschmann/hellcreek.h
>> index a08a10cb5ab7..2d4422fd2567 100644
>> --- a/drivers/net/dsa/hirschmann/hellcreek.h
>> +++ b/drivers/net/dsa/hirschmann/hellcreek.h
>> @@ -234,10 +234,17 @@ struct hellcreek_fdb_entry {
>>  struct hellcreek {
>>  	struct device *dev;
>>  	struct dsa_switch *ds;
>> +	struct ptp_clock *ptp_clock;
>> +	struct ptp_clock_info ptp_clock_info;
>>  	struct hellcreek_port ports[4];
>> +	struct delayed_work overflow_work;
>>  	spinlock_t reg_lock;	/* Switch IP register lock */
>> +	spinlock_t ptp_lock;	/* PTP IP register lock */
>
> Why use a spin lock and not a mutex?

No particular reason. Mutex will also work.

>> +	hellcreek->ptp_clock = ptp_clock_register(&hellcreek->ptp_clock_info,
>> +						  hellcreek->dev);
>> +	if (IS_ERR(hellcreek->ptp_clock))
>> +		return PTR_ERR(hellcreek->ptp_clock);
>
> The ptp_clock_register() can also return NULL:
>
>  * Returns a valid pointer on success or PTR_ERR on failure.  If PHC
>  * support is missing at the configuration level, this function
>  * returns NULL, and drivers are expected to gracefully handle that
>  * case separately.
>

I see, thanks. I guess we could add the missing NULL checks to remove()
and get_ts_info() to handle that case gracefully.

Thanks,
Kurt